Preface


(Michiel van Elk, Universiteit van Amsterdam) 


Thinking back on past events often involves a vivid memory of the people, the places
and the context involved. Clear pictures of conference venues and cities that seem 
frozen in time come to mind when thinking about past scientific meetings. The vi- 
sual nature of our memories may be taken as an example of the embodied view of 
language and cognition, which is the general topic of this volume. On this account, 
our knowledge about the world is grounded in sensory and motor concepts that were 
acquired through bodily experience. For instance, the concept ‘to grasp’ entails a mo- 
tor representation of the hand action that is involved in actual grasping. In line with 
this suggestion, it has been found that the processing of action verbs is associated with 
activation in similar regions in the premotor cortex that are involved in the actual execution
of the action that the verb refers to (Pulvermüller, 2013). Similarly, understanding
a concept like ‘grasping’ when observing the action of another person has also been as- 
sociated with activation in motor-related brain regions, suggesting that a process of 
motor simulation could support action understanding (Gallese & Lakoff, 2005). 

In the last decade, we have seen an enormous interest in embodied cognition theories 
among scholars from a wide range of different backgrounds. Cognitive neuroscientists 
have primarily investigated the when and how of activation in modality-specific brain 
areas in response to language and concept processing (van Elk, van Schie, & Bekkering, 
2014). Psychologists have experimentally determined the bidirectional relation between 
bodily and cognitive processing (Fischer & Zwaan, 2008). Philosophers have focused on 
the question whether embodied simulation processes meet the necessary and sufficient 
requirements to support higher-level processes such as mind reading or false belief un- 
derstanding (Jacob & Jeannerod, 2005). Linguists have investigated how our everyday 
use of concrete and abstract language in written and spoken form is related to basic
sensory and motor concepts (Gibbs, 2003).

I am convinced that this multidisciplinary approach is one of the major strengths of
embodied cognition. In a time in which many scientific disciplines have become increas- 
ingly specialized, a unifying theory that spans different domains and that ranges from 
developmental psychology to linguistics and from philosophy to dynamical systems 
theory has great potential. At the same time, the challenges faced by such a multidisciplinary
approach are non-trivial, as each field is characterized by specialist problems
that are often defined by the use of specific jargon. This theoretical challenge was
faced directly at the Sensory-Motor-Concepts in Language and Cognition meeting, in 
which linguists, philosophers, psychologists and neuroscientists participated — all with 
a shared interest in embodied cognition. As can be seen in the contributions to this
volume, a wide range of topics was addressed from a variety of perspectives,
encompassing both experimental and theoretical contributions. An intriguing question
is whether these different contributions are related and how they could lead to a
cross-fertilization of ideas. 

A possible starting point for such an integrative attempt is to acknowledge that although
the topics addressed by different disciplines may differ, they all share
a similar conceptual framework. At this point, an interesting parallel can be drawn 
with evolutionary accounts of language. Starting from the premise that language conferred
an adaptive advantage in the evolution of our species, different disciplines have
focused on more proximate or ultimate causes of language development (Arbib, 2005). 
For instance, anthropological accounts have investigated the fossil record to determine
precursors of the human vocal tract as a necessary prerequisite for the emergence of 
language. Developmental psychologists typically conduct experimental studies to in- 
vestigate how infants over the course of their first years acquire basic language abilities 
that often seem to go beyond the linguistic input that they received. Neuroscientists 
have elucidated the neural networks underlying language production and processing 
and have pointed out a striking overlap between the brain areas involved in the pro- 
duction of language and gestures, suggesting that gestural communication could be a 
precursor of a prototype of language. Thus, although differing in their topic of in- 
vestigation and their experimental approach, these findings converge on the idea that 
language should be understood in terms of its adaptive function and its relation to other 
more basic forms of action and communication. 

Similarly, within the framework of embodied cognition the different approaches converge
on the notion that language and cognition involve the use of sensory motor concepts.
This may be reflected in the use of metaphors referring to concrete sensory
motor domains, effects of concrete experiences on word reading and the activation of
sensory motor brain areas in response to reading action verbs. Furthermore, each of the 
different domains can be characterized by similar discussions regarding the question 
whether an embodied cognition explanation is the only and most viable account of the 
extant data. For instance, embodied theories of conceptual content are often contrasted 
with amodal theories, according to which our thinking is based on an internal and sym- 
bolic ‘language of thought’ that is abstracted away from concrete experience (Mahon & 
Caramazza, 2008). One important argument that is often used in the debate between 
embodied and amodal theories of cognition is the grounding problem: it remains un- 
clear how concepts derive meaning if they are unrelated to concrete experiences (Barsa- 
lou, 2008). The embodied account proposes an intuitive and plausible solution to this 
problem: the meaning of concepts is derived from the fact that concepts are by defi- 
nition sensorimotor in nature. More recently, several authors have proposed a hybrid 
model according to which semantic processing involves both multimodal and modality- 
specific processing (Louwerse & Jeuniaux, 2010; Ralph, Sage, Jones, & Mayberry, 2010). 
These ideas may lead to a conceptual refinement of the current theoretical ideas and 
it would be interesting to see whether eventually theoretical integration is possible, not 
only within specific research domains such as neuroscience or psychology, but across 
different domains as well. The collection of papers in this volume provides an excellent
first attempt at such an endeavor.

Last but not least, I would like to acknowledge Liane Ströbel, without whom this
project would not have been possible. She organized a stimulating conference and made
the effort to make the proceedings of this meeting available in the form of this special
issue of Düsseldorf University Press. It is my sincere hope that the discussions that
were started throughout this project will be continued in the future and will lead to a
further exchange of people and ideas.


Introduction: 
Sensory Motor Concepts — at the Crossroad 
between Language & Cognition 


(Liane Ströbel)


This book presents selected papers from the conference “Sensory Motor Concepts in 
Language and Cognition” organized by the DFG Collaborative Research Center 991: 
“The Structure of Representations in Language, Cognition, and Science” and held from 
December 01-03 at the University of Düsseldorf, Germany. It brings together
researchers working in the fields of computational linguistics, linguistics, literary studies,
neuroscience, philosophy and psychology, whose work contributes to the interdisciplinary
study of cognitive phenomena, specifically in the exploration of the role of sensory
motor concepts for language and cognition in general. The aim of this book is to
uncover hidden potentials and available prospects of inter- and transdisciplinary research
in the field of sensory motor concepts by defining common interests and objectives, and
sketching paths for fruitful interdisciplinary cross-fertilization, cooperative projects,
and research transfer.


What is so fascinating about sensory-motor concepts? 


According to Barsalou, mental representations used in cognitive tasks are grounded in 
the sensory-motor system. Therefore it is assumed that the human system of concepts 
cannot be regarded as either abstract or amodal, but as immediately anchored in the 
perception, experience and simulation of sensory-motor actions (Barsalou, 2008). This 
assumption is supported by the following facts: a) sensory-motor knowledge is the most 
specific and best-differentiated concrete human experience we possess, and b) sensory-motor
concepts are not only conceptually simple and easy to encode given the fact
that they are part of our everyday life, but due to their semantic complexity they can
also function as cognitive anchorage points for a diverse range of encoding strategies.
Therefore, it comes as no surprise that we use sensory-motor concepts as a model for 
less specific, less differentiated, more abstract knowledge, such as emotions, needs or 
temporal and spatial relations. The mere fact that even the words to undersTAND and
to COMPREHEND (< Latin prehendere ‘to catch, to seize’) can be traced back to sensory-motor
concepts and that we use sensory-motor-based metaphors, such as to GRASP an
idea or to HANDle a problem underlines the predominance of sensory-motor source 
domains in the lexicon. But grammar, too, is full of morphemes which can be traced 
back to sensory-motor activities. One example is the way we refer to time, e. g. French 
le passé ‘the past’ (something that has gone by), MAINTENANT ‘now’ (< Latin manu 
tenendo ‘in the hand holding’) and l’avenir ‘the future’ (< Latin advenire ‘still to come’) 
or that we encode emotions or feelings with the help of a possessive verb related to hand
action, such as I HAVE concerns, etc. Many light verbs and auxiliaries can also be traced
back to hand or foot actions, such as to GIVE a smile, to TAKE a walk, or I am GOING
for a swim, etc. Similarly, the copulae in Spanish can be traced back to bodily positions
(e.g. SER [< Latin sedere ‘to sit’] or ESTAR [< Latin stare ‘to stand’]) or the negation
in French to the denying of an action, such as to not TAKE a STEP (ne... pas ‘not a
step’), etc. (Ströbel, 2010, 2011). In all these examples the underlying strategy is based
on the fact that not only are the same brain areas activated whether we perform or just
imagine an action, but that we can also imagine a sensory-motor task, such as grasping
an object, without actually grasping it (Gallese and Lakoff, 2005), and that is exactly what
makes sensory-motor concepts so suitable for rendering abstract entities less abstract
by connecting them to concrete bodily actions (Ströbel, 2014).

The linguistic perspective is covered by theories in cognitive science which support 
this assumption by asserting that many concepts are grounded in sensory-motor pro- 
cesses (Barsalou, 2008; Gibbs, 2005; Pezzulo et al., 2011; Wilson, 2002). Psycholinguistic 
studies confirm that different sensorimotor experiences directly shape people’s use and 
understanding of complex situations and metaphorical statements. Neurological studies 
using neuroimaging techniques (e. g. fMRI, EEG) and also patient studies (Grossman et 
al., 2008) have furthermore provided several pieces of the puzzle concerning auditory 
language perception, reading and language production and deliver valuable insights 
into this highly developed cognitive function. 

The interdisciplinary interest in the topic is also reflected in this volume. Looking 
at the subject from a number of different perspectives, the various contributions here
elaborate the fact that language and body are closely interrelated.


Sensory-Motor Concepts and Language


The close connection between sensory-motor concepts and language is illustrated in the 
first part of this volume: Raymond Gibbs points out that much of everyday cognition 
and language has its roots in ongoing bodily experience. In his article, he describes a 
number of studies from the fields of experimental psychology and corpus linguistics 
and illustrates how metaphoric ideas and talk emerge from embodied simulation processes.
Valentina Cuccio proposes a usage-based model of language. Taking the idea
that speaking is acting as a starting point, she uses studies on action understanding in 
order to clarify language production and comprehension and to explain how inferential 
meaning is deduced from literal sentences. The close connection between sensory- 
motor concepts and metaphor is discussed by Johann-Mattis List, Anselm Terhalle 
and Daniel Schulzek. Analyzing traces of embodiment in Chinese character forma- 
tion, they underline the complex interactions between speaking, writing, and meaning. 
Wolfgang Müller’s approach starts from the assumption that, much like emotions
in actual life, emotions in literature are also grounded in the kinesthetic experience
of the body. In his contribution, he illustrates that literature is a productive field for
experimentation in matters of embodied cognition.


The diversity of Sensory-Motor Concepts and its implications 


The diversity of sensory-motor concepts and its implications is highlighted in the sec- 
ond part of this volume: Gerard Steen divides the group of sensory-motor concepts 
into five subgroups, namely motor concepts, sensory concepts, sight concepts, sound 
concepts, location and direction concepts. Furthermore, he also points out that the dif- 
ferent groups of sensory-motor concepts are preferred in different registers and that 
a complete study of sensory-motor concepts would involve a four-way interaction be- 
tween sensory-motor concepts, metaphor, word class, and register. Ralf Naumann 
outlines a theory of action verbs that combines an abstract, modality-independent com- 
ponent with a modality-specific component located in certain regions of the premotor 
cortex. His proposal is based on the observation that a verb like kick can be used to 
express diverse types of actions that differ with respect to parameters (e.g. telic vs. 
atelic, result vs. no result or atomic vs. iteration). Sander Lestrade addresses the question
of whether we should analyze “place”, a generalized location expressing the absence
of a change of location, on a par with mode expressions specifying the type of such a
change, i.e. “source” and “goal”. In his paper, he discusses the status of place markers
in a cross-linguistic sample of spatial-case inventories. Andrea Bellavia focuses on 
the connection between aspectuality and embodiment by analyzing a specific class of 
idiomatic constructions which systematically denote a change of location undergone by
a body part in the source domain, which is metaphorically projected onto the target
domain denoting an event carried out in an intensive fashion. He advances a two-level
integration model in order to display the semantic compositional representation
of such idiomatic constructions.


Sensory-Motor Concepts and Perception 


The close connection between sensory-motor concepts and perception is the focus of 
the last part of this volume: Lionel Brunel, Denis Brouillet and Rémy Versace’s 
approach is based on the close link between memory and perception and analyzes the 
influence of an auditory memory component upon the sensory processing of a sound 
by demonstrating the strong linkage between the access to our memory and the reac- 
tivation of the relevant sensory components, as part of the function of the respective 
context or the task. Martin Butz and Daniel Zöllner argue that progressively com- 
plex concepts and compositional structures can be developed starting from very basic 
perceptual and motor control mechanisms. They propose that the innateness of con- 
cepts may not be directly genetically imprinted, but concepts and compositional concept 
structures may be indirectly predetermined to develop due to the ontogenetic path laid 
out in the genes of the organism, the morphological constraints given by the body of 
the organism, and the environmental reality with which the organism interacts. Alex 
Tillas investigates the relationship between natural language and thinking. He takes 
as his starting point the assumption that thinking is imagistic, to the extent that con- 
ceptual thoughts are built out of concepts which, in turn, are built out of perceptual 
representations; and that concepts, the building blocks of thoughts, are associationistic
in their causal patterns. His claim is supported by independent empirical evidence
obtained from work done with aphasic subjects.


Abstract

An important claim in cognitive science is that much of everyday cognition and lan- 
guage has its roots in ongoing bodily experience. One place where embodiment is 
critical is in the creation and use of metaphoric talk. This article describes some of 
the studies from experimental psychology and corpus linguistics demonstrating how 
metaphoric ideas and talk emerge from embodied simulation processes where people 
imagine themselves engaging in the actions mentioned in the language (e. g., “grasp the 
concept”). Some of this newer work demonstrates not only how experimental studies can test
ideas from linguistics, but also how corpus studies can be used to examine falsifiable
hypotheses, first seen in psychology, on the embodied nature of metaphoric meaning.


1 Introduction 


Embodied metaphor refers to the idea that many metaphoric concepts are grounded 
in recurring patterns of bodily experience (Gibbs, 2006; Lakoff & Johnson, 1999). For 
example, both “I am struggling to get a good start in my career” and “My marriage is on
the rocks” refer to the concept that LIFE IS A JOURNEY. People’s journey experiences,
where they start at some source point, follow a path, and end up at some goal or
destination, are used to better structure more abstract concepts like life or career or
relationship. Much research in cognitive linguistics shows the importance of embodied 
source domains in metaphoric ideas and talk. 

To a significant extent, the experimental research on embodied metaphor is seen as 
verification for cognitive linguistic theories of embodied metaphor. But the rise of new 
work in corpus linguistics now sets the stage for a different kind of interdisciplinary 
collaboration between linguists and psychologists. This paper presents one example of 
this interaction between experimental psychology and corpus linguistics on the topic of 
embodied metaphor. My aim is to demonstrate some of the ways these two fields can be
integrated, especially in regard to testing specific potentially falsifiable hypotheses.


2 Experimental Studies on Embodied Metaphor


Many psycholinguistic studies have been conducted over the last 25 years to explore the 
ways that embodied metaphors may be recruited during people’s use and understanding 
of metaphoric language (Gibbs & Colston, 2012). These varied psychological findings, 
collected using a variety of experimental methods, indicate that the metaphorical map- 
pings between embodied source domains and abstract target domains partly motivate 
people’s understanding of the specific figurative meanings of many conventional and 
novel metaphors. 

For example, some experiments examined how immediate bodily experience influences
metaphor interpretations. In one series of studies on metaphorical talk about time,
students waiting in line at a café were given the statement “Next Wednesday’s meeting
has been moved forward two days” and then asked “What day is the meeting that has
been rescheduled?” (Boroditsky & Ramscar, 2002). Students who were farther along
in the line (i. e., who had thus very recently experienced more forward spatial motion) 
were more likely to say that the meeting had been moved to Friday, rather than to Mon- 
day. Similarly, people riding a train were presented the same ambiguous statement and 
question about the rescheduled meeting. Passengers who were at the end of their journeys
reported that the meeting was moved to Friday significantly more often than did people
in the middle of their journeys. Although both groups of passengers were having
the same physical experience of sitting in a moving train, they thought differently
about their journey and consequently responded differently to the rescheduled meeting
question. These results suggest that ongoing sensorimotor experience has an influence
on people’s comprehension of metaphorical statements about time.

One idea that has attracted a good deal of attention in cognitive science is the possibility
that much cognition and language is organized around embodied simulation processes
(Gibbs, 2006). Several different behavioral studies provide support for the view
that embodied simulations play some role in people’s immediate processing of verbal 
metaphors (Gibbs, 2006). People may create partial embodied simulations of speak- 
ers’ metaphorical messages that involve moment-by-moment “what must it be like” 
processes that make use of ongoing tactile-kinesthetic experiences (Gibbs, 2006). Understanding
abstract, metaphorical events, such as “grasping the concept,” for example,
is constrained by aspects of people’s embodied experience as if they are immersed in
the discourse situation, even when these events can only be metaphorically and not
physically realized (i.e., it is not physically possible to grasp an abstract entity such as a
“concept”). 

For instance, people’s speeded comprehension of metaphorical phrases, like “grasp
the concept,” is facilitated when they first make, or imagine making, a relevant bodily
action, such as a grasping motion (Wilson & Gibbs, 2007). One unique study revealed
that people walked further toward a target when thinking about the metaphorical
statement “Your relationship was moving along in a good direction” when the context
ultimately suggested a positive relationship than when the scenario alluded to a
negative, unsuccessful relationship (Gibbs, 2012). This same difference, however, was 
not obtained when people read the nonmetaphorical statement “Your relationship was 
very important” in the same two scenarios. People appear to partly understand the 
metaphorical statement by building an embodied simulation relevant to LOVE RELATIONSHIPS
ARE JOURNEYS, such that they bodily imagine taking a longer journey
with the successful relationship than with the unsuccessful one. 

A different set of experiments examined people’s understanding of the embodied 
metaphor TIME IS MOTION by first asking people to read fictive motion sentences, as in 
“The tattoo runs along his spine” (Matlock, Ramscar, & Boroditsky, 2005). Participants 
read each fictive motion statement or a sentence that did not imply fictive motion (e. g., 
“The tattoo is next to the spine”), and then answered the “move forward” question (e. g., 
“The meeting originally scheduled for next Wednesday has been moved forward two 
days”). People gave significantly more Friday than Monday responses after reading the 
fictive motion expressions, but not the non-fictive motion statements. These results
imply that people inferred the TIME IS MOTION conceptual metaphor when reading the
fictive motion expressions, which primed their interpretation of the ambiguous “move
forward” question. 

A follow-up group of studies had people engage in abstract motion to see if it in- 
fluenced their responses to the “move forward” questions (Matlock et al., 2011). Par- 
ticipants first filled in the missing numbers in an array that either went in ascending 
(e. g., between 5 and 17) or descending (e. g., between 17 and 5) order. When the partici- 
pants then answered the “move forward” question, they gave far more Friday responses 
after filling in the numbers for the ascending condition and gave more Monday answers 
having just filled in the numbers for the descending order condition. People appear to 
understand the meaning of time metaphors through a mental simulation of the implied 
motion, findings that are congruent with the claim that conceptual metaphors are active
parts of verbal metaphor processing.

These different behavioral studies offer support for cognitive linguistic claims about
embodied metaphor, but do so in a more systematic manner that allows for specific
hypotheses to be tested, and possibly falsified.


3 Psycholinguistics and Corpus Linguistic Studies 


The experimental studies reviewed above all employed constructed examples, following 
most cognitive linguistic work on embodied metaphor. But there is now more emphasis 
in linguistics on corpus studies examining the use of metaphor in naturalistic discourse. 
For example, read the words path and road when they are used in the two different 
metaphorical contexts below, and consider whether they convey the same meaning 
(Johansson-Falck & Gibbs, 2012): 


1. The Spaniard lost 10-8 6-3 2-6 8-6 to Charlie Pasarell in 1967. And even if Agassi 
survives his first test, his path to a second successive final is strewn with trip wire, 
with former champions Boris Becker and Michael Stich top seed Pete Sampras 
and powerful ninth seeded Dutchman Richard Krajicek all in his half of the draw. 


[emphasis ours] 


2. The learner who is well on the road to being a competent reader does bring a 
number of things to the task, a set of skills and attributes many of which are still 
developing. He or she brings good sight and the beginnings of visual discrimination.


[emphasis ours]


The meaning of path may be appropriate in (1) because of the uneven nature of 
Agassi’s journey toward winning the tennis match, while road seems apt in (2) because
the journey to becoming a competent reader is well established, and one that
many people have metaphorically travelled. Previous corpus linguistic studies show 
that metaphorical uses of path, road, as well as way, are not only structured according 
to primary/conceptual metaphors such as ACTION IS MOTION, LIFE/A PURPOSEFUL ACTIV- 
ITY IS A JOURNEY, and PURPOSES ARE DESTINATIONS, but also appear to be influenced by 
people’s embodied experiences with the specific concepts that these terms refer to in 
their non-metaphorical uses (Johansson Falck, 2010). Thus, both similarities and differences
between real world paths, roads and ways are reflected in how metaphorical
paths, roads and ways are described, both in the kinds and frequencies of obstacles that
people face on these journeys, and in the kinds of actions people engage in, on, or near
metaphorical paths, roads or ways.

Johansson-Falck and Gibbs (2012) conducted two studies, the first a psychological questionnaire
and the second a corpus linguistic investigation, to see if embodied simulation
processes are also prominent in people’s use and understanding of expressions like his 
path to a second successive final is strewn with trip wire in reference to Agassi’s metaphorical
journey to a tennis tournament championship, as seen in (1) above. Thus, people’s
embodied simulation in regard to their imaginative understandings of traveling along 
different paths and roads provides a major constraint on what gets mapped in various 
metaphorical instances of path and road. 

A first study investigated people’s experiences with paths and roads. Participants 
were given a booklet that first asked them to create a mental image of “being on a path” 
and then, on the next page, to form a mental image of “being on a road.” Following this, 
the participants turned the page and saw a series of questions, each of which could be 
answered by circling either the word path or road. Analysis of participants’ responses 
revealed the following qualities that people strongly felt they experienced along paths
and roads.


Paths
Something you travel on by foot
More up and down
More aimless in their direction
Something you stop on more often
More problematic to travel on


Roads
Straighter
Wider
Paved
Lead to a specific destination
Something you drive on


Overall, the results of this first study employing human participants demonstrated 
that people’s imaginative perceptions of paths and roads focus on the more central 
rather than peripheral aspects of their bodily actions relevant to these real-world arti- 
facts (e.g. on driving, but not walking, on roads, and on walking, but not driving, on 
paths etc.). Traveling along paths is clearly different in important ways from traveling
along roads.

A second study in this series provided a detailed corpus analysis of 240 metaphorical
instances of path and 47 instances of road in the British National Corpus. Most generally, the
corpus findings matched the intuitions we obtained in our first psychological study. For 
instance, path was frequently used to talk of greater, and more varied, difficulties in
travel in these contexts (23 %), but road was never used in this way. On the other
hand, only 12 % of the path examples, but 60 % (based on only 3 of 5 instances) of the
road instances included explicit mention about where the artifact leads (i.e. to eternity, 
to ruin, to stardom). The same differences are seen in the ways that path and road are 
used to describe the target domain of PURPOSEFUL ACTIVITIES/LIVES. Again, there were 
many more mentions of the difficulties associated with travel along paths (38 %) than 
roads (13 %). These difficulties may be related to obstacles in or on the path/road (e. g., 
their path to a winning was obstructed by an excellent performance from India, or the 
constant traps and barriers laid by the forces that would block our path and drag us down), 
or they correspond to a difficult area that someone or something is leaving or trying
to leave (e. g., [people] seek a path out of divisive ideological camps, or break though the
barriers of error to seek the road to truth).

Paths, but not roads, are connected with choices between alternative courses of action.
21 % of the path instances with the function of describing PURPOSEFUL ACTIVITIES/LIVES,
but none of the road cases, included words or phrases suggesting that there
may be more than one path to achieve a goal (e. g. only, best, the same, typical, a different
path to the same goal). The term road, on the other hand, is more often used than path in
talk about activities that people want to be efficient (e. g., PURPOSEFUL ACTIVITY/LIFE
and financial/political developments/processes), and paths are more often used to describe
actions or developments that may have a more hesitant, aimless, or step by step,
quality than roads (e. g., COURSES OF ACTION/WAYS OF LIVING, other types of DEVELOPMENT
and paths in COMPUTER/MATHEMATICS DEVELOPMENTS/PROCESSES). Path is used in
talk about processes and road in talk about ends of processes and results. Finally, path
is more closely connected to choices between different courses of action, compared to 
the much more efficient and single goal-oriented road. 

The link between people’s imaginative understandings of paths and roads and the 
metaphorical uses of path and road in discourse has several theoretical implications. 
First, people mentally simulate different kinds of actions in journeys along paths and 
roads and apply these experiences to shape their in-the-moment metaphorical 
understandings of abstract actions through the use of path and road. Second, the consistent 
patterns of findings for the psychological survey and the corpus investigation suggest 
that metaphorical language including terms that refer to artifacts is to some significant 
extent predictable. Most importantly, our combination of a psychological investiga- 
tion of people’s experiences of paths and roads with an extensive corpus analysis of 
metaphorical path and road shows that neither a conceptual metaphor theory explanation 
in terms of mappings at the levels of primary or complex metaphor, nor a purely 
social theory in which the use of path and road is negotiated between speakers, 
sufficiently accounts for the link between metaphorical meaning, mind and world. Instead, 
people’s imaginative perceptions of paths or roads are influenced by their understandings 
of these artifacts through embodied experience, which can then be simulated in the 
context of metaphoric thinking and speaking. 


4 Conclusion 


There is a large body of both experimental and corpus linguistic work on the embod- 
ied nature of many metaphoric concepts. The studies described in this article show how 
experimental and corpus research can nicely feed one another to create hypotheses that 
can be tested using either experimental or corpus linguistic methods. More specifically, 
cognitive linguistic studies strongly suggest that people’s recurring bodily experiences 
critically motivate aspects of their metaphoric talk. Psycholinguistic studies confirm 
that different sensorimotor experiences directly shape people’s use and understanding 
of various metaphorical statements. But the psycholinguistic work is limited in testing 
people’s immediate understanding of individual metaphors and does not explore the 
role of embodiment in larger discourse contexts. However, recent corpus linguistic re- 
search has demonstrated how specific hypotheses can be tested by examining detailed 
patterns of metaphoric language use within naturalistic speech and text (also see 
Stefanowitsch, 2011). This work shows that the metaphorical use of certain words is not 
simply a social process or accomplished via the direct activation of encoded primary or 
conceptual metaphors. Instead, similar to the experimental research, corpus linguistic 
methods are capable of revealing the constraining presence of embodied simulation 
processes in the ways people think and speak of different abstract, and in this case 
metaphorical, concepts. In this way, then, corpus linguistic analyses do not simply offer 
ideas for possible testing using behavioral methods, but can be the site of testing explicit 
hypotheses themselves. 

Embodied experience seems critical to people’s use and understanding of metaphoric 
ideas and language, a conclusion that vastly differs from traditional disembodied theories 
of metaphorical meaning and language use. Of course, many other factors, ranging 
from purely linguistic to social and cultural processes, also shape the creation and 
interpretation of metaphoric discourse. But it is unlikely that any of these forces can act 
alone, apart from the influence of bodily activity. The studies described in this article 
provide additional evidence that the embodied nature of metaphoric concepts is best 
characterized in terms of embodied simulation hypotheses in which people imagine 
themselves engaged in the actual events mentioned in the language, even when these 
involve actions that are physically impossible to perform in the real world. 


Abstract 

The aim of this paper is to focus on a problem that has not been sufficiently attended to 
by researchers in the embodied language paradigm. This problem concerns the inferen- 
tial level of communication. In real-life conversations implicit and inferential meaning 
is often the most important part of dialogues. However, embodied language research has, 
up to now, not sufficiently considered this aspect of human communication. Simulation 
of the propositional content is not sufficient in order to explain real-life linguistic 
activity. In addition, we need to explain how we get from propositional contents to in- 
ferential meanings. A usage-based model of language, focused on the idea that speaking 
is acting, will be presented. On this basis, the processes of language production and 
comprehension will be analyzed in the light of the recent findings on action compre- 
hension. 

Keywords: Inferential Communication, Embodied Language, Motor Simulation 


1 Some remarks on the Embodied Language Paradigm 


According to many authors (Barsalou, 1999; Gallese, 2008; Gallese & Lakoff, 2005; 
Pulvermüller, 1999, 2002) linguistic meaning is embodied. This means that the 
comprehension of an action-related word or sentence activates the same neural structures that 
enable the execution of that action. Gallese (2008) presented this hypothesis as the 
“neural exploitation hypothesis”. Language exploits the same brain circuits as action 
does. According to this hypothesis, our linguistic and social abilities are grounded in 
our sensory-motor system. The Mirror Neuron System (MNS) is the neural structure 
that supports both our motor abilities and our social skills, language included. Thus, in 
this account, actions and language comprehension are mediated by motor simulation. 
We understand actions such as John taking a bottle from the refrigerator and drinking 
some milk, at least in part, by simulating the same actions in the Mirror Neuron System; 
and we understand a sentence such as “John took the bottle from the refrigerator and 
drank some milk”, at least in part, by simulating the corresponding actions in the same 
neural network that executes those actions. 

This seems to hold true even for the understanding of abstract linguistic meanings. 
Indeed, in that case, metaphorical thought allows us to map from a sensory-motor do- 
main to an abstract domain. This mechanism, according to Gallese and Lakoff (2005), 
is the basis for the construction and comprehension of abstract meanings and concepts. 

Now, imagine entering a bar; you look at the barman and say: “Water”. Or imagine 
being a firefighter; you are in front of a building on fire and you scream out loud to 
your co-worker: “Water!”. Imagine getting lost in the desert. At some point you see 
an oasis and say aloud to your exhausted friend: “Water”. In each of these cases, the 
word ‘water’ by itself expresses a full proposition, and it is a different proposition in 
each case (Wittgenstein, 1953; Lo Piparo, 2007). 

It is also very likely that, in all of these examples, linguistic comprehension implies 
a mental simulation by the interlocutor. And it is also very likely that in these three 
different contexts the very same word will enable three completely different mental 
simulations. In the first case the simulation will probably concern the actions of putting 
water in a glass and giving the glass to a customer. In the second case, the simulation 
will concern the action of pumping water on the building using a fire hydrant. And 
finally, in the last example the interlocutor will comprehend that very same word as a 
piece of information, “there is water over there”, and as an invitation, “let’s go and drink 
some water”. His mental simulations will most likely concern these linguistic contents. 

The very same word, then, can express full propositions with entirely different mean- 
ings. None of these possible meanings is literally present in the speech act. Indeed, 
propositions produced and comprehended in these examples are implicit and inferen- 
tial. Considering that, in the simulative account, language comprehension is realized 
by means of an embodied simulation of the propositional content, how can we explain, 
in this account, the simulation of a full proposition starting only from the uttering of 
a single word? 

Imagine now a boy who returns home. His father sees him and asks: “So?” and the 
boy answers with a smile: “That’s fine”. This conversation can only be understood by 
someone who shares the same background knowledge as the participants. For example, 
the boy could have returned from an exam, a job interview, or from a date with a girl 
he really likes, and the father is asking about this. Thus, it is likely that in this case both 
the father and the son are performing a mental simulation. But is the mental simulation 
pertinent to the words “So” and “That’s fine” or to the implicit meanings that can be 
inferred from those words? The latter is more likely. Consider that these very same 
words uttered in a different context by different people would have a very different 
meaning. 

The aim of this paper is to focus on a problem that only very recently has started to 
be addressed by researchers working in the embodied language paradigm. This problem 
concerns the inferential level of communication. In real-life conversations, implicit and 
inferential meaning is often the most important part of a dialogue. However, up to now 
embodied language research has not sufficiently considered this aspect of human 
communication. 

Indeed the most influential model of language at work in embodied language research 
is mainly based on the idea that we have semantic circuits in our brain where 
our linguistic knowledge, in terms of word meanings, is stored in a fairly stable way 
(Pulvermüller 2002). Language comprehension, thus, implies the activation of our semantic 
knowledge, which is often coded in terms of action, perception or emotion knowledge, 
according to the Wittgensteinian idea that different word kinds imply different 
forms of knowledge (Pulvermüller 2012). However, a semantic-based model of language 
understanding that basically relies on a fixed and conventional repertoire of meanings 
does not sufficiently explain what really happens when people speak. A simulation 
of propositional content does not sufficiently explain real-life linguistic activity. Indeed, 
the question that must be addressed is: what does it mean for the two utterances in the 
above dialogue to be subjected to a simulation of their propositional content? In addition, 
we need to explain how we get from the propositional content to the implicit 
content and inferential meaning. Simulative understanding is “immediate, automatic 
and almost reflex-like” (Gallese 2007). Pulvermüller (2012, 442) describes the brain 
processes that reflect comprehension as immediate, automatic and functionally relevant 
as well. However, can this definition of comprehension processes explain how we get 
from literal meaning to inferential meaning? This question should push us to reflect on 
the nature of automatic processes and to deepen our understanding of such processes. 
It could be that even automatic and subpersonal processes are sensitive to the context. 
Findings from recent empirical studies support this hypothesis. Contextual effects on 
motor simulation during linguistic processing have been assessed in behavioural (e. g. 
van Dam, Rueschemeyer, Lindemann, & Bekkering 2010) and functional magnetic resonance 
imaging (fMRI) studies (e.g. Papeo, Rumiati, Cecchetto & Tomasino 2012; van 
Ackeren, Casasanto, Bekkering, Hagoort, & Rueschemeyer, 2012). These findings suggest 
that contextual information prevails over semantics. However, how precisely this 
happens is still an open question. In any case, these data raise an issue that all 
semantic-based accounts of language understanding should address. Also, non-trivial 
philosophical implications for our understanding of what semantics really is and how it 
works, and for the notion of automaticity, should be drawn from these data. 

It is worth noting that this paper does not question the fact that language is embodied. 
Instead, the aim of the paper is to highlight the limitations that studies mainly 
focused on descriptive and action-related usages of language inevitably have. These 
limitations have been largely undervalued by researchers working in the embodied language 
paradigm. Even in those studies that addressed non-literal usages of language, 
experimental sets seem to lack a realistic pragmatic context that can trigger a process 
of inferential communication. They rarely take into account more pragmatically complex 
dialogues such as, for example, the one between the father and son previously 
discussed. Thus, if these kinds of stimuli, which are far closer to real-life linguistic activity, 
were taken into consideration, we would probably see that language production 
and comprehension imply the activation of the Mirror Neuron System in a peculiar, 
pragmatically-based, way. In other words, as some studies already suggest (Papeo et 
al. 2012; van Ackeren et al. 2012; van Dam et al. 2010), motor simulation occurring 
during linguistic comprehension is very likely contextually determined and not fixedly 
linked to the literal meaning of words. 

Consequently, there is a second, related problem that is worth noting here. It 
concerns the definition of meaning and semantics adopted, sometimes implicitly and 
sometimes explicitly, in the embodied language paradigm. 

The language model adopted in this paradigm seems to be that of the dictionary. In 
the dictionary model of language, there is a fixed repertoire of words and each word is 
associated with a meaning. Of course, language seems to also show some imperfections 
such as polysemy and homonymy, but even these facts can be explained by the model 
of the dictionary. Indeed, each sense of a polysemous or homonymous word works 
as if it were a different word with its own related meaning that we can eventually 
find in the dictionary. The word’s context allows the activation of the right meaning 
in any sentence. However, sometimes the context is too ambiguous, and this leads to 
misunderstandings. This appears to be the only room left for pragmatics in embodied 
language research (even when contextual effects are taken into consideration, these are 
considered as something outside the speaker that, in some way, interacts with fixed 
meanings stored in the speaker’s “head”). 

In contrast, the pragmatic dimension of language is more extensive than the problem 
of polysemy and homonymy, even though these are themselves more complex than what has 
been sketched out here. A more comprehensive account of language should be provided in 
order to address issues concerning the pragmatic dimension of language. 


1.1 A Usage-Based Model of Language 


Since the first half of the twentieth century, researchers in the fields of the Philosophy 
of Language, Pragmatics, Linguistics, Discourse Psychology and even Anthropology 
have been outlining a usage-based model of language. The vast and very rich literature 
on this topic numbers among its contributors philosophers such as Wittgenstein, Austin 
and Grice, linguists such as Levinson and Horn, discourse psychologists such as Barlow and 
Kemmer and anthropologists such as Sperber. Although partially different currents of 
thought can be identified among these researchers, their accounts present some com- 
mon features. Hence, the next question to address is: what are the defining features 
of the usage-based model of language? 

A good starting point is an examination of semantics and its role in the construction 
of linguistic meaning. The key to understanding the role of semantics is the distinction 
between what is literally said and what is intended by the utterance of a sentence (the 
sentence’s meaning and the speaker’s meaning, in Grice’s words). This distinction in 
itself suggests that the semantic level only, with compositionality rules, is not sufficient 
in order to understand linguistic activity. A second, pragmatic, step of language 
comprehension seems to be necessary. However, the problem is to determine to what extent 
the first semantic level can be considered autonomous from the pragmatic level of 
language. In other words, is there a residual literal meaning that we can call semantics 
or should meaning always be considered as contextually determined at every level? If 
the latter option holds true, language understanding does not proceed from a minimal, 
literal, proposition to the intended meaning. Pragmatic processes operate extensively 
at every level of language comprehension. 

Currently, in the pragmatic debate these two different accounts of the semantic/pragmatic 
distinction are known as Minimalism and Contextualism. However, independently 
of this debate, neither Minimalism nor Contextualism accepts the idea that a 
consideration of semantics as a fixed repertoire of meanings can sufficiently explain 
the process of language production and comprehension. Semantics does not seem to 
be enough. In fact, if we look again at what usually happens in real-life conversations, 
we will see that linguistic meaning is tightly linked to the context of speech, to the 
background knowledge of the speakers, to their shared knowledge and to their aims 
in that context (Carapezza & Biancini in press). To know the dictionary definition of 
each word plus the rules of their composition is not sufficient in order to grasp the 
speaker’s meaning. 

We all know perfectly well the dictionary definitions of the words ‘so’, ‘that’, ‘is’ and 
‘fine’. However, this knowledge is not sufficient in order to understand 
what the father and son in our example are talking about. Hence, to understand language 
we need to understand how, when, where, by whom and why words are used. This 
idea leads to a definition of meaning that is very different from the one presented in 
the dictionary model of language. In this account, meaning is defined by the use of a 
word in a specific context. 

We can now turn to another point. Linguistic meaning is the product of a mutual 
identification of communicative intentions. Without the possibility of understanding 
other people’s mental states, and in particular their communicative intentions, language 
would be a mere code. Indeed, it is the ability to understand other people’s mental states 
and in particular their communicative intentions that makes irony, figurative language, 
jokes or even misunderstandings possible. If we only simulate the propositional content 
of an ironic utterance, how can we understand its ironic meaning? And how can we 
get the ironic meaning if we do not understand the presuppositions and implicatures 
of that sentence? And how can we understand the presuppositions and implicatures of 
a proposition if we do not understand other people’s mental states? 

In other words, how can we get the meaning of this sentence without implying a 
complex mindreading ability? 

This last point allows us to make a leap forward. Indeed, the key to understanding 
inferential communication is exactly a complex mindreading ability. The automatic, 
immediate and reflex-like form of mindreading realized by embodied simulation is not 
sufficient in order to explain inferential communication. 

Questions concerning the identification of the functional mechanisms of mindreading 
involved in real-life conversations and their neural implementation are still open. 
These issues will be discussed in the following sections. 


2 Becoming Ironic. How Do Children Develop an 
Understanding of Irony? 


Irony is a very clear example to highlight the role of mindreading in language compre- 
hension. Moreover, studies on the development of the ability to understand irony can 
help us to identify those steps of socio-cognitive development that we need to achieve 
in order to become ironic. 

Irony has been a widely addressed topic of study for more than two millennia. In the 
1st century AD, the Roman rhetorician Quintilian defined irony as a figure of speech 
consisting in intending the opposite of what is literally said (contrarium quod dicitur 
intelligendum est). This definition is still very popular, along with the many other 
theories of irony available today. 

As Colston and Gibbs (2007) noted in their introduction to the edited volume “Irony 
in Thought and Language”, a host of different theories of irony have been presented and 
are currently discussed. And each of them seems to be able to explain only a part of this 
very complex phenomenon. For some researchers (Wilson and Sperber, 1992), irony 
implies an echoic reference to a desired or expected event while an undesired event is 
taking place. For others (Clark and Gerrig, 1984), irony is the realization of a pretence. 
The speaker is acting out the beliefs or behaviours of others and in doing so he is 
distancing himself from them. 

These two accounts are just examples, influential but by no means representative 
of the large number of theories of irony that are presently discussed (see Colston 
and Gibbs, 2007 for a review of contemporary theories of irony). 

However, despite the number of different definitions, irony is, beyond all doubt, 
a very good example of inferential communication. This is true for many reasons. 
In order to grasp the ironic meaning of an utterance, we need to understand the 
presuppositions and implicatures of that utterance. Indeed, the use of irony implies, 
at least, a form of violation. Irony can express the violation of expectations (Colston, 
2000; Kumon-Nakamura, Glucksberg, & Brown, 1995; Wilson and Sperber, 1992), the 
violation of relevance, appropriateness and manner (Attardo, 2000), or the violation of 
the Gricean Maxim of quality (Kumon-Nakamura et al., 1995). In any case, each of 
these forms of violation entails a presupposed shared knowledge. Indeed, in order to 
feel that something is the expression of a violation, we need to know, implicitly or 
explicitly, that something different should have been the case in that context. Speaker 
and addressee need to share this knowledge and they need to reciprocally know that 
they share this kind of knowledge. If not, irony will not succeed. Moreover, if irony 
succeeds, we understand the meaning of the speaker’s intentional violation. And since this 
meaning is not explicitly expressed, the speaker and addressee need to implicate it. 
Thus, the processing of irony entails the ability to handle presuppositions (the 
shared knowledge) and implicatures (meanings inferred from violations). Furthermore, 
the addressee needs to comprehend the goal of the speaker in order to understand his 
ironic meaning and to make reference to context (both the physical context of speech 
and the background knowledge of the speaker and the addressee). These issues hold 
true for many other language usages, but in irony comprehension they are particularly 
evident. 

How can we explain the process of inferential understanding in an embodied ac- 
count? That is, how can we explain the comprehension of something that is not literally 
present in the sentence but only presupposed and implicated by it? Can we hypothe- 
size that it is a chain of simulations that leads to the inferential, ironic meaning? Does 
this chain of simulation need to start with the simulation of the propositional content 
or not? Does the process of inferential understanding need to be implicit or explicit? 
These are open empirical questions that await experimental study. 

A look at the development of irony-understanding might help to clarify these exper- 
imental questions. Indeed, developmental studies can help us to identify the cognitive 
mechanisms necessary for irony-understanding and this could make the task of looking 
for their neural implementation easier. 

Why do developmental studies of irony matter? Developmental studies on irony 
tell us something about the step of cognitive development that is necessary in order to 
produce and understand irony. These studies are focused on the identification of the 
social-cognitive mechanisms needed in the production and understanding of irony. On 
the other hand, studies on the production and comprehension of irony in adults seem to 
be more focused on the pragmatic description of the phenomenon. Adult studies seem 
to be interested in the social functions of irony, in its communicative effects, in the role 
played by the context in the construction of ironic utterances, and so on. 
They do not seem to be strictly focused on the identification of the social-cognitive 
mechanisms underlying the use of irony, as developmental studies are (Filippova and 
Astington, 2010). 

As Filippova and Astington (2010) have recently claimed, much of the research that 
has been carried out in the developmental line of study (e. g., Happé, 1993, 1995; 
Sullivan, Winner, & Hopfield, 1995; Winner, Brownell, Happé, Blum, & Pincus, 1998; Winner 
& Leekam, 1991) has highlighted the fact that the ability to make second-order mental 
state attributions is required in order to be able to produce and comprehend irony. This 
claim is so strong in developmental studies that the production and comprehension of 
irony is often used as a test for evaluating the possession of a sophisticated mindreading 
ability, i. e. a full Theory of Mind. Indeed, Theory of Mind, the ability to attribute mental 
states to other people and to understand them, shows a gradual development. It is possible 
to identify different levels of Theory of Mind. The first entails the ability to implicitly 
attribute intentions, mainly motor intentions, to others. The second level implies the 
capacity to explicitly reason about other people’s mental states (desires, beliefs, intentions, 
etc.). A third level implies the ability to reason about other people’s mental states 
concerning, in their turn, other people’s mental states (e.g. “I know/believe/predict 
that John knows that Mary knows”). Accordingly, different kinds of Theory of Mind 
tests, such as the false-belief test, are usually run. Clements and Perner (1994), using 
an anticipatory looking paradigm, showed false belief understanding in children aged 
2 years and 11 months; in Southgate et al. (2007), the age of false belief understanding 
was lowered to 25 months using the same experimental paradigm. Recently Buttelmann, 
Carpenter and Tomasello (2009) carried out a study using an active helping paradigm. 
This study showed false belief understanding in 18-month-old infants. In these studies, 
children are not requested to explicitly and verbally reason about other people’s 
intentions. Their helping behaviours and their eye gaze directions seem to suggest false 
belief understanding. 

A false-belief task can also be explicit and verbal and it can test first and second or- 
der mental representations. Indeed, in the “Anne and Sally” test (Wimmer and Perner, 
1983) the experimenter asks children about Anne’s (false) belief or asks about what 
Sally knows that Anne knows. The former is a first-order mental representation test 
and is passed by children around the age of 4 years; the latter is a second-order mental 
representation test and children are usually able to pass the test only after their 4th 
birthday. The use of irony is considered proof of a full Theory of Mind ability. 
In fact, many studies carried out with both typically and atypically developing chil- 
dren seem to suggest that the understanding of second-order mental representations 
is needed in order to acquire irony (Happé, 1993, 1995; Sullivan, Winner, & Hopfield, 
1995; Winner, Brownell, Happé, Blum, & Pincus, 1998; Winner & Leekam, 1991). 
Although there is no general agreement on the exact age at which children start to use 
irony, this is, beyond all doubt, a later achievement in language acquisition. According 
to some researchers (Demorest et al. 1983, 1984) children become competent ironists 
at about 13 years of age. According to others (e. g., Harris & Pexman, 2003; Sullivan 
et al., 1995; Winner & Leekam, 1991; see Filippova and Astington 2010 for a review) 
children of 6 years of age can already comprehend some form of irony. As Filippova 
and Astington argue, this difference may be due to the fact that those studies looked 
at different aspects of irony understanding. Moreover, they might show evidence of 
a gradual development of irony comprehension. In any case, even the results attesting 
irony competence at six years of age are fully compatible with the claim that irony entails 
an understanding of second-order mental states. Indeed, results by Perner and Winner 
(1985) attest understanding of second-order mental states at around the age of six or 
seven years. 

Very briefly, we can say that irony entails the ability to go beyond the propositional 
meaning of an utterance, which sometimes can be literally true and sometimes can be 
literally false, and to grasp a speaker’s intended meaning through the recognition of a 
form of violation. In order to carry out this inferential process, a complex mindreading 
ability seems to be necessary. Indeed, psycholinguistic studies carried out in typically 
and atypically developing children verify the necessity of a second-order mindreading 
ability in order to produce and comprehend irony. 

Irony is then a paradigmatic example of inferential communication. Studies on 
the development of irony understanding offer us some hints about the socio-cognitive 
mechanisms that are necessarily involved in the development of inferential abilities in 
language production and comprehension. Most of the studies on embodied language 
seem to still disregard the question of how this inferential process works during linguistic 
activity and where and how in the brain it is implemented. 


2.1 Speaking is Acting 

In a recent article by Friedemann Pulvermüller (2012), the sketch of a neurobiological 
model of language is preceded by an introduction about semantic theories. Importantly, 
Pulvermüller introduces pragmatic concepts into embodied language research. Indeed, 
the ideas of the philosopher Ludwig Wittgenstein are given plenty of room in this 
introduction. In particular, Wittgenstein’s notions of “meaning as usage” and “word 
kinds” are presented. There are different kinds of meaning that lead to different kinds 
of words and, Pulvermüller says, each kind leads to the activation of a different area 
of the brain. So, for example, we have object-words, action-words or emotional words. 
Semantic knowledge, in these word kinds, is coded in our brain respectively in terms 
of perception knowledge, action knowledge or emotional knowledge. 

However, despite the interesting discussion of these Wittgensteinian notions, the 
account of semantics that Pulvermüller proposes is completely describable according to 
the dictionary model of language. In fact, his account is grounded in the idea that 
semantics is made up of the binding of a word form and a kind of meaning knowledge, 
and that language comprehension is the act of connecting the word form to the right 
knowledge, i.e. to a pattern of neural activation. Pulvermüller does not really look at 
usages of words in speech act contexts, which was one of Wittgenstein’s main concerns 
and one of the most interesting aspects of his philosophical legacy. The problem of 
how intentions, background knowledge, context, etc. come together to construct 
meaning is not addressed by Pulvermüller nor by most of the other researchers working 
in the embodied paradigm. 

Boulenger, Hauk and Pulvermüller (2009) carried out an fMRI study on the comprehension
of idioms, considered as examples of non-literal meaning. This study compared the
comprehension of literal and non-literal (idiomatic) sentences containing action-related
words. The authors found that the comprehension of both literal and idiomatic
sentences containing action-related words led to somatotopic activation along the motor
strip. These findings were further confirmed in a later study carried out by Boulenger,
Shtyrov and Pulvermüller (2012) using a different technique, magnetoencephalography
(MEG), that affords more temporal information about brain processes. Data from this
second study revealed somatotopic activation of precentral motor systems during the
processing of both literal and idiomatic sentences containing action-related words.

However, despite the fact that these studies take into consideration forms of
non-literal meaning, they seem to be very far away from the goal of understanding
inferential communication in real-life linguistic activity. Indeed, participants of both studies
read sentences (e.g. “Pablo kicked the habit” and “Pablo kicked the ball”) on a computer
screen, without any contextual information. This means that participants did not have
to face any pragmatic task that could have triggered inferential understanding and,
consequently, for example, a different modality of recruitment of the motor system. If we
utter the sentence “Pablo kicked the habit” in a real-life conversation in order to talk,
for example, about a friend that has stopped smoking, would the pattern of neural
activation be exactly the same? We can hypothesize that, on the basis of our background
knowledge, the idiom is interpreted as “Pablo stopped smoking” and the somatotopic
activation in the motor system could, thus, pertain to the action of smoking and not the
action of kicking.

It is now possible to turn to another issue in pragmatics that seems to be
undervalued in embodied language research, although it might be very important for
understanding how language works. This issue concerns the definition of language
as action. To speak is never just a mere neutral description of states of affairs. Speaking 
always implies the carrying out of both a physical and a social action. By using irony, 
we can ridicule or praise someone; with a declaration we can start a war, a love affair, 
or a hearing in the court; with words we can apologize, we can get married, we can 
name children or boats. And the list could go on infinitely because the social actions 
carried out by language are potentially countless. It is important to note that speaking 
is also an action in the physical sense. Indeed, speaking implies the movement of the 
oro-facial muscles and often of the hands, which can be involved in co-speech gesturing 
(or hands and co-sign mouthing in the case of sign languages). 

Therefore, this should lead researchers to look at language as the performance of 
physical and social actions. Speaking is acting in a broader sense than just naming 
objects, actions or abstract concepts. By speaking, we always want to do something. In 
fact, many of the actions that make us human can only be carried out in language. 
Speaking implies some kind of background knowledge, goals and intentions; it im- 
plies physical movements and it has social effects. On the whole, non-linguistic in- 
tentional actions seem to share these very same features. And besides, linguistic ac- 
tivity entails communicative intentions, mainly not present in non-linguistic and non- 
communicative actions. 

However, often linguistic actions are undervalued and what is taken into account is 
only the process that links a sign, i.e. a word form, to a meaning. 

The definition of language as action has been widely discussed by philosophers of 
language like Austin and Wittgenstein. However, researchers working in the embodied
language paradigm, despite the fact that they were greatly responsible for the discovery
of empirical evidence in support of the claim that language is deeply grounded in the
brain systems for action and perception, seem not to consider speaking as being an
action itself. When I say “Pablo kicked the ball” or “Pablo kicked the habit” I have an
intention and I expect my action to have an effect in the real world. And I presuppose
that you share the knowledge with me that will allow you to understand what I am
saying.

Imagine that I want you to hire Pablo in your company, but you do not agree with
me because Pablo has been having trouble with alcohol. I come to your office and 
say: “Pablo kicked the habit”. This utterance is sufficient to let you understand my 
request. Without a sophisticated and mutual recognition of intentions and beliefs, this 
linguistic exchange could not work. Furthermore, how could I perform this action of 
requesting without language? Humans, then, have a very complicated kind of action, 
linguistic actions. Hence, we should look at language from the same perspective we use 
to understand action. 

This leads us again to the problem of the mindreading systems needed in order to
understand action/language.


3 Comprehending Other People’s Actions


If speaking is acting (the speaker is performing an action and the addressee has to 
interpret the speaker’s action), studies on action understanding can help us to clarify 
language production and comprehension. In particular, these studies could help us in 
the task of understanding how the mindreading ability is involved in the construction 
of meaning. How do we get inferential meaning out of literal sentences and what is the 
role of mindreading in the construction of inferential meaning? 

Recently, many works have been devoted to the task of identifying the neural
mechanisms that support our ability to understand other people’s mental states. This ability
seems to be necessary for action understanding (see Frith and Frith 2006 for a review).
In fact, as Frith and Frith argue (2006, 531), mental states determine actions.

Very often the inferential process of mentalizing is carried out automatically. This 
means that it does not entail conscious thought or deliberation. 

Often, when we are involved in the task of understanding other people’s actions,
implicit and automatic inferences are carried out in the Mirror Neuron System. However,
simulations carried out in the Mirror Neuron System cannot always explain the full
process of understanding others’ goals and intentions (Frith and Frith, 2006; Mitchell,
Macrae and Banaji, 2006). For example, as Mitchell, Macrae and Banaji argue (2006),
motor simulation cannot explain long-term attitudes. The question is still under debate.
Although mindreading seems to be a very important function, its neural
implementation is still controversial. In particular, while the role of the Mirror
Neuron System in understanding the motor intentions of familiar actions is less
controversial, the possibility of a different neural implementation is under consideration for
a more sophisticated form of mindreading that would allow for the understanding of
non-familiar actions.

Following Brass et al. (2007), it is possible to say that we have two different
accounts of the systems that allow us to interpret others’ behaviours. According to one
of them, based on the process of motor simulation, we understand others’ actions by 
simulating them through the activation of the mirror neuron system. According to a 
second account, action understanding is realised by means of inferential processes im- 
plemented in non-mirror circuits of the brain (Brass et al., 2007). The findings of Brass 
and colleagues (2007) support the idea that action understanding in novel and
implausible situations is primarily mediated by an inferential interpretive system rather than
the mirror system. Following the authors, an action is implausible if its goal is not
obvious but requires context-based inferencing. According to the authors, implausible
action understanding activates a brain network involved in inferential interpretative 
processes that lack mirror properties (Brass et al. 2007). No differential activation was 
found in the mirror neuron system in relation to the contextual plausibility of observed 
actions. 

Then, in this model the comprehension of implausible action is the result of a 
context-sensitive inferential process of mentalizing. 

Turning again to the problem of language production and comprehension, what kind 
of mindreading mechanism is at work when we produce and comprehend linguistic 
actions? And in particular, what kind of mindreading mechanism is at work in the
understanding of inferential communication (e.g. irony, jokes or daily conversations
such as the one previously discussed)?

In light of the findings of Brass et al. (2007), it is reasonable to hypothesize that in 
the understanding of inferential meaning in daily communication we also need a more 
complex and inferential form of mindreading that should be involved, being an integral 
part of it, in the dynamic process of the construction of meaning. It is plausible that 
this mechanism interacts with other mechanisms also involved in linguistic
comprehension, such as the mechanism of motor simulation. These considerations push us
to deepen our understanding of the role of contextual effects on language and action
understanding. Furthermore, these considerations push us to reflect more on the role
of these contextual effects on automatic mechanisms such as the mechanism of motor
simulation. Only further empirical studies can help to clarify these issues.


Abstract
In this paper, we develop a frame approach for modelling and investigating certain pat- 
terns of concept evolution in the history of Chinese as they are reflected in the Chinese 
writing system. Our method uses known processes of character formation to infer dif- 
ferent states of concept evolution. By decomposing these states into frames, we show 
how the complex interaction between speaking, writing, and meaning throughout the 
history of the Chinese language can be made transparent. 


1 Introduction 


In this paper, we discuss the complex interaction of the written form, spoken form 
and meaning in Chinese. We show that conceptual processes such as metonymy or 
metaphor and the sensory-motor grounding of human conceptualization are reflected 
in Chinese character development. Our analysis is based on the modelling of conceptual 
processes by means of a frame-based approach to character formation. 

After introducing the notion of embodiment and its role for language development 
and linguistic analysis, we point out some general properties of the Chinese writing 
system, i.e. Chinese character forms, their place in traditional sign models and prin- 
ciples of character formation. We then give a short introduction on how concepts can 
be modelled as recursive attribute-value structures called frames. The main section
consists of a frame-based analysis of selected character formation processes which illustrate
the different ways phonemic, graphemic, and semantic components interact.


2 Embodiment and language


The term embodiment refers to a number of partly overlapping theories whose common 
denominator is the claim that cognition requires the interaction of a body with the 
world (Wilson 2002, Ziemke 2003). The view we adopt in this paper is that abstract 
concepts evolve on the basis of concepts which arise from perception and action. This 
approach is taken by Barsalou (1999) who proposes that concepts are constructed from 
perceptual symbols, i.e. subsets of modal representations which are stored in long-term
memory and reused symbolically to stand for objects in the world.


2.1 Conceptual development and language reconstruction 


Lakoff and Johnson (1980) were the first of now many linguists (e.g. Gibbs 2003 and 
Steen 2010) to underline the fundamental role that metaphor plays in the construction of 
abstract concepts based on physical concepts. They postulate that systematic correlates 
between emotions (such as happiness) and more basic sensory-motor experiences (such 
as an erect body posture, which is supposed to be often concomitant with happiness) 
lead to the metaphorical understanding of the more abstract concept on the basis of 
the concept resulting from the perceptual experience (Lakoff 1980: 58). This conceptual 
relation is reflected in language where words like up and down stand for spatial concepts 
as well as for emotional states: cheer up!, I’m feeling a bit down, we’ve had our ups and 
downs. 

Thus, the word up preserves information regarding the sensory-motor source concept
which underlies the abstract emotional concept. The link, which allows the inference
that there is a relation between the two concepts, is the fact that they are associated
with the same sound chain [ʌp]. Moreover, the emotional concept became a meaning
of up only recently, whereas the spatial meaning is close to that of the Indo-European
etymon *upo <under, from under> (Pokorny 1959).

Not all cases are phonetically and morphologically as transparent as up, which
means that more reconstruction work concerning the formal part of the linguistic sign
is necessary to be able to draw conclusions about the semantic side. The sound chain
of the Latin word capacitas <ability> goes back to the Indo-European root *keh₂p- <to
seize, to grasp> via Latin capere <to seize> (or to the non-laryngealized *kap-, which
cannot be excluded; Georges 1998, Rix et al. 2011), and French [ʃɛf] <boss, chief>
stems from Latin [kaput] <head> (Gamillscheg 1997, see Figure 1 and Figure 2),
which in turn might be derived from the root of Latin capere as well (Vaan 2008).


[Figure 1: Indo-European root → Latin [kapere] → Latin [kapa:ks] → Latin [kapa:kita:s]; meaning development from <grasp> to <ability>]

Fig. 1: Etymology of Latin capacitas.


[Figure 2: Latin [kaput] → Gallo-Romance *kapum → Old French [tʃief] → Modern French [ʃɛf]; meaning development from <head> via <center, boss> to <chief, boss>]

Fig. 2: Etymology of French chef.


Independently of the morphological transparency, the genetic relation (or identity
as in the case of up) between the sound chains can thus be seen as a trace of the
sensory-motor grounding of the more abstract concepts <ability> and <boss> on the
basic concepts <grasp> and <head>. This information about conceptual development
is of interest for historical semanticists and cognitive scientists in search of linguistic
evidence for embodiment.

However, reconstructing the history of a word, i.e. regressing its sound chain back 
to earlier forms, leads to a sound chain which is no less arbitrary with respect to the 
concept it designates than the word itself. Tracing back the evolution of French chef, we 
obtain the Latin word caput. Its sound chain does not tell us anything about its meaning,
which is something we have to investigate at the same time.


2.2 Traces of embodiment in Chinese character forms 


As we have seen, reconstructing the form of a linguistic sign does not automatically
provide knowledge about its meaning. This is different with the Chinese writing system.


1 Our anonymous reviewer points out that the -ut ending does contain information about gender, declension 
or number, and thus provides semantic content. However, this does not alter our argument because -ut, 
as a linguistic sign, is as arbitrarily linked to its meaning as cap-.

Chinese characters consist of 1) the character meaning, 2) the character reading, i.e. a
sound chain, and 3) the (written) character form. Reconstructing the evolution of the
character form does not lead us to a collection of brush strokes related arbitrarily to
any kind of concept, but to an iconic image character, to a representation of the concept
originally designated by the form.


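[Figure 3: two series of character forms, from early pictographic images to the modern forms; glosses <head> and <chief> for the first series]

Fig. 3: Development of the Chinese character forms for <chief, first> and <fish>.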


Consider the Chinese character forms for the concepts <chief, first> and <fish> (shǒu 首
and yú 魚, see Figure 3). Tracing back their evolution, we obtain less abstract images
and end up with the source concept of <chief> which is <head> and for <fish> which is,
not surprisingly, <fish>. The abstract concept <chief, first> is grounded on the physical,
bodily concept <head> whereas <fish> is not grounded on another basic concept as it is, 
in itself, a concept with physical, visible and touchable instantiations which are directly 
perceivable by sensory-motor means. 

Thus, the successful reconstruction of the Chinese character form directly provides
the concept associated with it. Of course, we do not deny that even the interpretation of
the underlying image is subject to a certain arbitrariness. In the case of Chinese shǒu 首,
for example, it cannot be completely ruled out that the underlying image depicts
something other than a head; and even if we admit that it shows a head, the question arises
as to what kind of head it is. However, because of their form-representing character,
these signs are less open to interpretation than are non-onomatopoeic sound-based
signs: assuming that we do not have any additional information, an icon provides more
clues than a sound chain. This makes the Chinese writing system attractive for the
study of embodiment.


3 Chinese characters 


The Chinese writing system (CWS), as we know it today, is famous for its structural
properties reflected by a complicated interaction of phonetic and semantic elements.²
Since the Chinese characters can be divided into elements carrying phonetic as well as
semantic functions, it is sometimes called a ‘semanto-phonetic writing system’ (yìyīn
wénzì 意音文字, cf. Zhou 1998: 60), yet this characterization exaggerates the actual
power of Chinese characters to display phonetic information in a transparent way: most
of the “phonetic” characteristics of the CWS are relics of the processes of character
formation which, as they took place asynchronously, were always characterized by
a complex interaction between the Chinese language spoken at different times of its
history, the sociocultural background of those people who created the characters, and
general patterns of reasoning and conceptualization.


3.1 General characteristics of the Chinese writing system 


From a phonetic perspective, the CWS can be characterized as a syllabic writing system, 
since every character represents a syllable of the Chinese language. From a semantic 
perspective, on the other hand, it is a morphemic writing system, since the majority of all 
characters represents a minimal semantically meaningful unit of the Chinese language. 
In contrast to the dichotomic structure of alphabet systems, a Chinese character
therefore has a trichotomic structure, since it can be characterized by its form, its meaning,
and its reading (List 2009). Thus, the Chinese character cǎi 采 <to pluck> is defined by its
written form 采, its meaning ‘to pluck’, and its reading [tsʰai²¹⁴] (see Figure 4). Given
this specific structure, we prefer the term morpheme-syllabic writing system (Chao 1968:
102) over the above-mentioned term semanto-phonetic writing system, since this term
more closely reflects the concrete units of the semantic and the phonetic domain that
are referred to by a Chinese character.
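
The trichotomic structure described above can be illustrated with a minimal sketch. The class name `Character` and its field names are hypothetical labels chosen for this illustration only; the reading is given in pinyin as a stand-in for the phonetic transcription.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Character:
    """Hypothetical record for one Chinese character (illustrative names)."""
    form: str     # the written character form
    meaning: str  # the morpheme, i.e. a minimal meaningful unit
    reading: str  # the syllable, given here in pinyin

    def components(self) -> tuple:
        # The three components that jointly characterize the character.
        return (self.form, self.meaning, self.reading)


# The example from the text: form, meaning and reading of one character.
cai = Character(form="采", meaning="to pluck", reading="cǎi")
```

An alphabetic sign would, by contrast, need only two such fields, which is one way of picturing the difference between a dichotomic and a trichotomic structure.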


2 The use of the term “phonetic” follows the terminology that is used in the mainstream discussions on the 
topic. Our anonymous reviewer, however, is surely right in stating that it is rather “morphonological” than 
strict “phonetic identification” we are dealing with here.

Fig. 4: The trichotomic structure of Chinese characters.


3.2 External and internal structure of Chinese characters 


An important aspect of Chinese character forms is their two-fold structure: Character 
forms can be analysed with respect to their external and their internal structure (List 
2008: 45 f.). Here, external structure refers to the formal aspects of the way the forms are 
built, i.e. the number, the order, and the direction of strokes. Internal structure refers 
to the motivation underlying the creation of the forms. While an analysis with respect 
to the external structure is strictly synchronic, an analysis of the internal structure is
always done with respect to the diachronic dimension of a character.

As an example, consider again the character cǎi 采 <to pluck> (see Figure 5, middle).
Based on its external structure one can divide the form into a sequence of eight different
strokes (see Figure 5, left). The internal structure, on the other hand, can only be
understood when going back in time and looking at the oracle bone version of the
form, which dates back to around 1000 BC (see Figure 5, right). Here, one can see a
hand which plucks some kind of fruit from a plant.³ Judging from the old version of the
character form alone, the pictographic motivation might not be too obvious. But both
the picture for <hand> and the picture for <fruits on a plant> are reflected in other old
character forms as well, so there can be little doubt that the original motivation for the
creation of the character form was to depict the process of grasping.


3.3 Basic types of Chinese character formation 


By now, it should have become clear that, in contrast to many alphabetic systems,
the formation of the Chinese character forms was not accomplished ad hoc, but instead
took a certain amount of time, whereby many character forms were created during
different time periods. The way new character forms were derived remained, however,
rather stable during the history of the CWS.

3 This is, of course, an overstatement, since we cannot see an action on a static picture, but have to infer
the action from what we see.

Fig. 5: Chinese character form (middle) with its internal (right) and its external (left) structure.

Based on the internal structure of the form, one can roughly distinguish three
different types of character forms: (1) semantic characters, i.e. characters whose formation
was only semantically motivated, (2) phonetic characters, i.e. characters whose
formation was purely phonetically motivated, and (3) semanto-phonetic characters, i.e.
character forms whose formation was both semantically and phonetically motivated.⁴ As
an example for the first formation type, consider, again, the character cǎi 采 <to pluck>.
As was shown in the preceding paragraph, its form was originally a pictogram of a
hand grasping some kind of fruit. Therefore, the motivation was purely semantic. The
original form never provided any hint regarding the pronunciation of the word which
it was supposed to refer to.⁵ As an example for the second formation type, consider
the character kù 酷 <cool>. This is a recent borrowing from English, pronounced as
[kʰu⁵¹] in Chinese, and the Chinese reflection of the word cool in the modern sense of
being cowboy-like and calm. Since the Chinese originally did not have a written
representation for this loan word, they chose to use another character with an identical
reading in order to reflect this specific word, resulting in a purely phonetic motivation for
this specific use of the character.⁶ As an example for the third formation type, which
combines phonetic and semantic motivation, consider the same character kù 酷 with its
original meaning <cruel>. Its form can be divided into the two elements yǒu 酉 <bottle
with liquid> and gào 告 <to tell>, where the first probably serves as a semantic trigger
for the original meaning of the word (“ripe”), while the second has a phonetic function,
giving a hint to the pronunciation of the word (cf. Old Chinese *kˤuk for 告 vs. *kʰˤuk
for 酷).⁷

4 This is a very rough classification of Chinese characters, for a more refined classification, see, e.g., List
(2008).

5 At least we don’t have positive evidence for a phonetic function.

6 This is a bit of an oversimplification, since in China the selection of characters to represent words that have
so far no written representation is always driven by certain semantic considerations.

Based on this rough distinction between the three different types of character forms,
one type of primary and two types of secondary character formation can be
distinguished. Primary character formation was often pictographic or ideographic. Secondary
character formation, i.e. the formation of character forms based on already existing
ones, was either based on phonetic borrowing or on semantic reinforcement.⁸ As an
example, consider the character xiàng 象 <elephant>. The formation type of its character
form is primary, since it originally was semantically motivated, as a pictogram of an
elephant, and one can therefore display the relations between meaning, reading and
form of the character as illustrated in Figure 6 (left). Yet, already very early on, the
Chinese used this character form not only for <elephant>, but also for <image>, which
was pronounced in the same way as the word for elephant. Lacking a character form for
such an abstract concept, they simply took the Chinese character form for <elephant>,
and assigned it a different meaning. Therefore, the second meaning of the form 象 is
purely phonetically motivated, and a new character was formed by means of
borrowing. The relation between reading, form, and meaning can be displayed as illustrated in
Figure 6 (middle). In even later times, the Chinese apparently did not feel quite
comfortable with having two meanings expressed by a single character form, and so they
created a new character for <image>. This was done by adding a semantic element to the
character form, which would distinguish <elephant> from <image>. Taking the form of
the character rén 人 <human> as an additional semantic element, a new character was built
by means of semantic reinforcement. In contrast to the previous character forms, the
new form has a double reference to both the reading and the meaning of the character,
as illustrated in Figure 6 (right).
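
The three stages just described (primary formation, phonetic borrowing, semantic reinforcement) can be sketched as operations on a simple record. This is only an illustrative sketch: the names `Character`, `borrow` and `reinforce` are hypothetical, and the reinforced form is written as a string such as "人+象" rather than as the actual composite glyph.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Character:
    """Hypothetical record; field names are illustrative assumptions."""
    form: str
    meaning: str
    reading: str
    motivation: str  # "semantic", "phonetic" or "semanto-phonetic"


def borrow(source: Character, new_meaning: str) -> Character:
    """Phonetic borrowing: reuse form and reading for a homophonous word."""
    return replace(source, meaning=new_meaning, motivation="phonetic")


def reinforce(borrowed: Character, semantic_element: str) -> Character:
    """Semantic reinforcement: add a semantic element to the borrowed form."""
    return replace(
        borrowed,
        form=semantic_element + "+" + borrowed.form,  # placeholder notation
        motivation="semanto-phonetic",
    )


# Stage 1: primary (pictographic) formation.
elephant = Character("象", "elephant", "xiàng", "semantic")
# Stage 2: the same form is borrowed for the homophonous word <image>.
image_borrowed = borrow(elephant, "image")
# Stage 3: the element <human> is added to disambiguate the two meanings.
image_new = reinforce(image_borrowed, "人")
```

In this sketch, borrowing leaves form and reading untouched and changes only the meaning, while reinforcement changes only the form, which mirrors the order of events described in the text.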


4 Frames 


In cognitive sciences, the term frame is used for several kinds of meaning
representations of situations or objects. What all approaches have in common is that concepts are
not considered as atomic units, but rather as highly structured entities. Barsalou (1992)
develops his frame theory in contrast to meaning representations by feature lists, as
they have been used in early cognitive semantics. Barsalou criticizes the decomposition
of concepts into unordered sets of features because “people do not store
representational components independently of one another” (Barsalou 1992: 27). Instead,
Barsalou points to evidence from several experiments that human cognition is based
on attribute-value structures: The attributes describe general properties or dimensions
of the object or category being represented, and the values are specifications of the
attributes. From this point of view, the values correspond to features in feature lists,
while the attributes represent the relations between these features and the represented
object or situation. According to Barsalou, frames are recursive in that values and
attributes are represented in further frames. Thus, it is almost impossible to reconstruct a
“complete” frame. Rather, we will always refer to partial frames in the following, i.e.
we will only point out those attributes that are currently relevant.

7 Old Chinese readings follow Baxter & Sagart (2011).

8 This is a very rough description of the basic types of Chinese character formation. For a more detailed
account on Chinese character formation, see especially Qiú (1989).

Fig. 6: Basic types of character formation.

Petersen (2007) uses directed graphs to model frames in the sense of Barsalou. In 
frame graphs, the arcs correspond to attributes and the nodes correspond to values (see 
Figure 7 for an example). The central node of the frame is marked by a double border. 
It designates the object or category being represented in the frame. Mathematically, 
attributes correspond to partial functions mapping values to values. As a consequence, 
attributes are right-unique, i.e. every attribute is specified by exactly one value.
Because of their right-uniqueness, attributes lend themselves to being named with functional
nouns in the sense of Löbner (2011), who distinguishes four basic types of nouns,
depending on two binary features: relationality and uniqueness. Functional nouns are
inherently unique and inherently relational, because their reference to a possessum is
uniquely given once a possessor argument is saturated. Typical examples are nouns like
mother or nose that identify their referent uniquely according to a possessor: a mother is
always a mother of someone, and everyone has exactly one [biological] mother.
Analogous statements hold for the noun nose. Due to their inherent relationality,
functional nouns mostly occur in possessive constructions (cf. Löbner 2011: 14-18).


Fig. 7: car frame as a directed graph. 
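
The graph reading of frames described above (nodes as values, arcs as attributes, attributes as right-unique partial functions) can be sketched as follows. This is a minimal illustration under our own assumptions, not Petersen's formalization: the class `Frame` and its methods are hypothetical, and apart from the attribute motor, the attribute and value names in the example are invented.

```python
class Frame:
    """Hypothetical frame graph: nodes are values, labelled arcs are attributes."""

    def __init__(self, central: str):
        self.central = central  # the node marked by a double border
        self.arcs = {}          # (source node, attribute) -> value node

    def add(self, source: str, attribute: str, value: str) -> None:
        key = (source, attribute)
        # Right-uniqueness: an attribute assigns at most one value to a node.
        if key in self.arcs and self.arcs[key] != value:
            raise ValueError(
                f"{attribute!r} of {source!r} is already {self.arcs[key]!r}"
            )
        self.arcs[key] = value

    def value(self, source: str, attribute: str):
        # Partial function: None where the attribute is not specified.
        return self.arcs.get((source, attribute))


# A partial frame for a car (values and the attribute "fuel" are invented).
car = Frame("car")
car.add("car", "motor", "engine")
# Recursion: a value can carry attributes of its own.
car.add("engine", "fuel", "diesel")
```

Storing arcs in a dictionary keyed by node and attribute is one simple way of guaranteeing right-uniqueness, since a key can be bound to only one value at a time.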


Löbner (2005) argues that functional nouns are verbalizations of attributes in frames
such that concepts can be decomposed in terms of functional nouns. On this basis, we
are able to identify the range of values an attribute can take. Building on Guarino
(1992), we distinguish between the relational and the denotational interpretation of
functional nouns. The relational interpretation refers to the relation that links the
possessor to the possessum. The denotational interpretation, however,
refers to a certain possessum according to a given possessor. In mathematical
terms, functional nouns are functions, where the relational interpretation corresponds
to the mapping rule of the function and the denotational interpretation to the value the
function takes according to a given argument. For instance, the relational interpretation
of the concept mother in the NP Paul’s mother is the mapping rule “x is mother of y”,
while the denotational interpretation is the referent of the NP.
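
The difference between the two interpretations can be made concrete with a toy function. The data and the function names `relational` and `denotational` are hypothetical and serve only to illustrate the distinction between a mapping rule and the value it yields; the persons other than Paul are invented.

```python
# Hypothetical toy data for the functional noun "mother":
# each possessor is mapped to exactly one possessum.
MOTHER = {"Paul": "Mary", "Anna": "Mary"}


def relational(noun: dict, x: str, y: str) -> bool:
    """Mapping rule: does 'x is <noun> of y' hold?"""
    return noun.get(y) == x


def denotational(noun: dict, possessor: str) -> str:
    """Value of the function for a given possessor (the referent of the NP)."""
    return noun[possessor]


# Relational reading of "Paul's mother": the rule "x is mother of Paul".
holds = relational(MOTHER, "Mary", "Paul")
# Denotational reading of "Paul's mother": the person referred to.
referent = denotational(MOTHER, "Paul")
```

Because the noun is modelled as a dictionary, each possessor has exactly one value, which corresponds to the uniqueness of functional nouns, whereas several possessors may share the same value.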

Due to their twofold interpretation, functional nouns are able to designate attributes 
as well as their values: attributes correspond to the relational interpretation of functional
nouns and values to their denotational interpretation. For instance, the functional
noun motor describes the attribute <motor> in Figure 4 as “value x is the motor
of the object y”, while its denotational interpretation makes it possible to refer to the
motor of the object itself. Thus, the values of an attribute have to be hyponyms of the 
denotational interpretation of the functional noun with which the attribute is named. 
This interpretation of attributes is in line with Barsalou who postulates that “[v]alues 
are subordinate concepts of an attribute” (Barsalou 1992: 31). A special case is that of attributes
in verb frames that contain information about theta roles. Their range is determined by
the selectional restrictions of the verb. We will mark value ranges in verb frames by naming
the range on top of the value node (see Figure 8 for an example).


Fig. 8: Frame of the verb to hit (value ranges named on the value nodes: creature, object).


5 A frame model of character formation and concept evolution 


Since we assume that frames are the general format of human cognition, frame theory 
offers a tool to describe stages in concept evolution that are reflected in Chinese char- 
acter formation. In the following, we discuss three examples which illustrate how the 
sensory-motor grounding of human conceptualization is reflected in the formation of 
new Chinese characters. 

The first example illustrates the development of the character cài 菜 <vegetable>.
Originally, there was no specific character for this concept, and therefore the character
cǎi 采 <to pluck> was used to designate the concept. The problematic polysemy was only
later resolved, and the character form was modified by adding the form of the character
cǎo <grass> on top. The frame of the <plucking action> contains a theme argument
which takes a kind of plant as its value. On the linguistic surface, <to pluck> could be
expressed by the word [*m-sˤrəʔ], which is the way the word was pronounced around
600 BC (Baxter & Sagart 2011). Since a vegetable is something that is typically plucked
when it is ripe, it is a possible value for the theme argument. Chinese word formation
around 600 BC allowed the derivation of verbs by prefixation and suffixation (Sagart
1999). One common process involved the suffix [*-s], which nominalizes
verbs (Sagart & Baxter 2011): adding [*-s] to [*m-sˤrəʔ] yields the word [*m-sˤrəʔ-s],
which has the meaning <plucked (things)>.

Over time, the meaning <plucked (things)> developed into the more specific meaning
<vegetable>. The metonymical relationship between <to pluck> and <vegetable> and the
formal relationship between the character reading associated with <to pluck> and the
one associated with <vegetable> resulted in the use of the same form for <to pluck>
and <vegetable> (see Figure 9). Problematic polysemies, e.g. polysemies concerning
concepts which are part of the same frame, tend to be resolved by the speakers (Blank
1997: 357). To distinguish the concepts on the linguistic surface, a new form for the
concept <vegetable> was created (see Figure 10). The concepts <vegetable> and <grass>
are instantiations of the class <plant>. To resolve the polysemy, the form for <grass>
is added to the form for <pluck>. Thus, a character form for <vegetable> is created
by grounding the concept in the metonymically related motor action <to pluck>, and
the ambiguity of the character form for <pluck> is subsequently resolved.


Fig. 9: Frames for «to pluck a vegetable» and «grass».


Fig. 10: Creation of a new character for the concept «vegetable».

The second example illustrates the development of the form of the character qǔ 娶
<to marry (a woman)>, which is built as a combination of qǔ 取 <to grasp> and nǚ 女
<woman> (see Figure 11). The systematic correlates between the symbolic, i.e. abstract,
act of marriage and the sensory motor experiences accompanying it, i.e. taking the
bride to another place, as opposed to jià 嫁 <(leaving the family) to marry (a man)>,
result in the grounding of the symbolic act in the bodily actions. This is reflected in
the combination of the characters for qǔ 取 <to grasp> and nǚ 女 <woman> into a new
character which stands for <to marry>.

The frame of a typical grasping action contains the theme argument, which typically
has objects as values. The form of its character qǔ 取 is an abstraction of a picture showing
a hand grasping an ear. The pronunciation sounded approximately like [*tsʰoʔ]
(Baxter and Sagart 1999).° The theme argument allows many kinds of values, for instance
women. The concept <woman> is represented by nǚ 女, a form which originally
depicted a person sitting with the legs to the side. When the class of the theme argument
is <woman>, the whole frame represents the bodily action <to grasp a woman>.
The concept <to grasp a woman> is more specific than the non-saturated concept <to
grasp>, i.e. the upper-type concept of the theme attribute is substituted by a subsumed
concept of the original concept, so that the range of the attribute is reduced (see Figure
12).
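
The narrowing of the value range can be sketched in code. The following Python is a minimal illustration under stated assumptions: the toy type hierarchy, the `Frame` class and the helper names are hypothetical and serve only to show that saturation with a subsumed concept reduces the range of the theme attribute.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed toy type hierarchy (child -> parent); not taken from the frame figures.
SUPERTYPE = {"woman": "person", "person": "object", "ear": "object"}


def subsumes(general: str, specific: str) -> bool:
    """True if `specific` equals `general` or lies below it in the hierarchy."""
    node: Optional[str] = specific
    while node is not None:
        if node == general:
            return True
        node = SUPERTYPE.get(node)
    return False


@dataclass(frozen=True)
class Frame:
    concept: str
    theme_range: str  # upper-type concept of the theme attribute


def restrict_theme(frame: Frame, new_range: str, new_concept: str) -> Frame:
    """Replace the theme range by a subsumed concept, yielding a more specific frame."""
    if not subsumes(frame.theme_range, new_range):
        raise ValueError(f"{new_range!r} is not subsumed by {frame.theme_range!r}")
    return Frame(concept=new_concept, theme_range=new_range)


grasp = Frame("to grasp", "object")
grasp_woman = restrict_theme(grasp, "woman", "to grasp a woman")
print(grasp_woman)
```

Under these assumptions, every value admitted by the specialized frame is also admitted by the original one, but not vice versa, which is what makes the two meanings taxonomically related.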

The lexicalization of this new, specialized meaning resulted in a situation where the
reading [*tsʰoʔ] and the associated form 取 had two taxonomically related meanings.
This problematic polysemy was resolved by merging the characters qǔ 取 <to grasp> and
nǚ 女 <woman> to create the new form qǔ 娶, which stands for the concept <to marry
(a woman)>, an abstract concept grounded in the sensory motor concept <to grasp a
woman> (see Figure 13).

The third example illustrates the creation of the character xiǎng 想 <to think>, which,
judging from its derivation as a compound of the characters xiàng 相 <to observe>
and xīn 心 <heart/mind>, can be metaphorically understood as <to observe with one’s
heart/mind>. This means again that an abstract concept is traced back to a sensory
motor concept which results directly from perceptual experience. The metaphorical
process consists of a modification of the attribute-value structure of the concept <to
observe> (which typically takes as instrument the concept <eye>; see Figure 14), as the
instrumental argument is saturated by the concept <heart>. As the argument saturation
violates the original concept structure, no literal understanding is possible, so that the
resulting concept is necessarily abstract.
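
The contrast with the second example can be sketched as a simple range check. The Python below is a hypothetical illustration: the type hierarchy and the function names are assumptions, and the only point is that a value outside the original range of the instrument attribute blocks a literal reading.

```python
from typing import Optional

# Assumed toy hierarchy (child -> parent); not taken from the frame figures.
SUPERTYPE = {"eye": "body part", "heart": "body part"}


def is_a(concept: str, ancestor: str) -> bool:
    """True if `concept` equals `ancestor` or lies below it in the hierarchy."""
    node: Optional[str] = concept
    while node is not None:
        if node == ancestor:
            return True
        node = SUPERTYPE.get(node)
    return False


def reading(instrument_range: str, value: str) -> str:
    """Classify a saturation of the instrument attribute as literal or non-literal."""
    return "literal" if is_a(value, instrument_range) else "non-literal"


# <to observe> typically takes <eye> as its instrument.
print(reading("eye", "eye"))
print(reading("eye", "heart"))
```

In this sketch <heart> and <eye> share a supertype but neither subsumes the other, so saturating the instrument with <heart> cannot count as a specialization of the original frame.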

In the abstract concept, <heart/mind> figures as the value of the instrumental argument.
The reading that represented the concept <observe with one’s heart/mind>,
i.e. <to think>, was derived from the pronunciation of the more general concept <to
observe>: [*saŋs] was changed to [*saŋs-ʔ > *saŋ-ʔ], meaning <to think> (Schuessler
2007: 46 f.). The polysemy of the form which now stood for <to observe> and <to think>
was disambiguated by integrating the character form for xīn 心 <heart/mind> into the
character form for xiàng 相 <to observe> (see Figure 15).


° As the anonymous reviewer pointed out, this is the practice of “cutting off the ears of an enemy and
hanging them on a ritual girdle as a trophy, later called guó 馘”.


Fig. 11: Frames of «to grasp» and «woman».


Fig. 12: Frame of the more specific concept «to grasp a woman».


Fig. 13: Creation of a new character for «to marry (a woman)».


Fig. 14: Frame of «to see». 


Fig. 15: Creation of a character for «to think». 


6 Summary 


The processes of Chinese character formation reflect different stages in concept development.
They are well documented throughout the history of Chinese. Thus, the Chinese
language offers rich possibilities to study concept evolution. Frame theory offers a tool
for decomposing these different stages in concept evolution in a cognitively adequate
way. Therefore, a frame approach may shed new light on concept development by
analysing the interaction between writing, speaking, and meaning. In this paper, we
demonstrated how frames can be used to model and investigate such different instances
of concept evolution as metonymy, argument saturation, and metaphor. At its current
stage, our work remains exploratory, yet we are confident that the method provides a
promising starting point for future research.
Abstract

This article attempts to bring insights on the role of sensory and motor
processes as essential constituents of cognition to bear on literature. It concentrates
on the representation of emotion. The starting-point of the investigation is the fact that the representation
of emotion in literature is, analogous to emotion in actual life, essentially constituted
by motion and other physical or physiological manifestations. The investigation
is supported by cognitive and neuroscientific research concerning the interdependence
of emotion and motion. Evidence is adduced that emotional experiences are
represented in a great many literary texts as cognitive experiences with a strong
participation of kinesthetic activities of the body.

Keywords: embodied cognition, emotion, motion, facial feedback, body metaphors 


1 Introduction 


Cognitive science has recently had such a strong impact on literary studies that one
can speak of a cognitive revolution in the scholarly treatment of literature (Stockwell,
2002). The new concept of embodied cognition or grounded cognition, which accords
the body a central role in shaping the mind (Wilson, 2002; Barsalou, 2008, 2010), has,
however, not yet found much resonance in literary studies. Embodied cognition is an
extraordinarily wide concept. It means that the areas in the brain which activate the
body and those which are involved in processing reason and linguistic meaning work
interdependently (Wilson, 2002; Mahon/Caramazza, 2008). In fact, an embodied theory
of meaning seems to be taking shape which relates the meaning of words and sentences to
bodily action (see for instance Glenberg and Kaschak, 2002).

The present article¹ is a new departure in that it attempts to bring insights on the role
of sensory and motor processes as essential constituents of cognition to bear on literature
(Pezzulo et al., 2011). The notion of sensory-motor processes is comprehensive, since it
includes, on the one hand, gestural, facial aspects and other movements of the body,
and, on the other hand, phenomena such as smelling, tasting and hearing. It is not
always possible to treat these processes separately. The following contribution concentrates
on emotion, which has been a quite prominent topic in cognitive studies for
the last two decades (Hogan, 2010, p. 237). It attempts to show that writers of fictional
literature seem to have had a notion of embodied cognition long before the term was
created. A basis for applying sensory-motor concepts to literature as a product of the
imagination is that “imagining and doing use a shared neural substrate” (Gallese/Lakoff,
2005, p. 456), i.e. when one imagines seeing something, the same parts of the brain
are used as when one actually sees. Or when we imagine that we are moving, some
of the same parts of the brain are used as when we actually move. Movement, perceived
on a screen or represented in a text, may cause cognitive processing analogous
to real-life movement. This notion can even be extended to the use of metaphors. It
has been shown that not only actual and imagined physical activities, but also physical
events evoked in metaphors (Lakoff/Johnson, 1999; Schrott/Jacobs, 2011) and idioms
(Boulenger, Hauk, Pulvermüller, 2008) can be related to the motor cortex. The whole
development in philosophy and psychology from Descartes’ dualism of res cogitans and
res extensa to modern views of the interconnectedness of body and mind in Antonio
Damasio’s reference to the “embodied mind” (Damasio, 1984, 1999) or Matthew Ratcliffe’s
understanding of feelings as “bodily states” (Ratcliffe, 2008), which provides a
cultural context for recent advances in cognitive science, cannot be treated within the
frame of this article. While certain developments in the philosophy and psychology of feeling
lead to concepts which come close to what is called embodied cognition in cognitive
science, this contribution will discuss literature as another significant area of cultural
production which can be related to the context of embodied cognition.


1 I am indebted for valuable help and inspiration to my Jena colleagues Doreen Triebel, Dirk Vanderbeke and
Oliver Bock. For whatever may be open to criticism in this study they are not responsible.


2 Motion and Emotion 


The starting-point of the following investigation is the fact that the representation of
emotion in literature is, analogous to emotion in actual life, essentially constituted
by motion and other physical or physiological manifestations. The etymology of the
word is already significant. Emotion is derived from the French verb émouvoir, which is
based on Latin emovere: in the Latin verb the preposition “e” means “without”/“out of”
and movere means “move”. The connection of the phenomenon of emotion with motion
is ubiquitous in general language use, as the following randomly chosen metaphorical
words and phrases show: “Gefühlsbewegung”, “revulsion of feeling”, “he fell in love”,
“he fell into a depression”, “his heart started pounding when he saw her”.

Before starting our investigation, at least one piece of evidence from neuroscience
for relating emotion and motion will be adduced to support our procedure.
A team of researchers from Cambridge and Berlin conducted experiments with
18 participants, using magnetic resonance imaging to compare brain activation evoked by
emotion words to that evoked by face- and arm-related action words (Moseley, 2011).
The result was that emotion words evoked activity in the motor brain systems. That
means that emotion words activate the motor system in a way comparable to action
words. So the attempt to correlate emotion and motion is corroborated by findings
of neuroscience. For what holds true for actual reality can also be taken for granted for
imagined reality.

If we look at the representation of emotion in literature, we notice that it is, as 
is the case in actual reality, essentially constituted by motion and other physical or 
physiological manifestations. This holds true for visual art, too, as Edvard Munch’s
painting “The Scream” shows.


Edvard Munch, The Scream 1893, National Gallery, Oslo 
The hands of the screaming person are elongated and pressed tightly to the head. The
mouth is wide open; so are the eyes, empty and unfocussed. The person is not actually
in motion. The painting shows a face frozen with horror, which is the result of a
movement, which is also reflected in the landscape. Munch’s work is an outstanding
example of the visual representation of embodied emotion. However, we must be aware
of the fact that the painting is a work of art, of great art at that, and not what cognitivists
would normally study. It transcends what is accessible to scientific experiments. But
it is based on a principle on which many modern scientific theories of emotion are
founded, namely that emotion is largely manifested in the body.


3 Facial Feedback in Narrative Prose 


Textual analysis will begin by examining the representation of facial and other physical 
activity in narrative prose in relation to emotion. Cognitivists like Adelman and Zajonc 
interested in the phenomenon of emotion have emphasized “the role of emotional facial 
action in the subjective experience of emotion” (Adelman, 1989, p. 249). Adelman avoids 
‘the convention of referring to emotional facial action as “expression” since this term 
imposes an a priori theory, implying that emotional facial action (facial efference) has its 
major role in the manifestation of internal states’ (Adelman, 1989, pp. 249-250). While 
I agree with Adelman and Zajonc in avoiding subjectivist and expressionist theories, I 
take the liberty of using, at times, the term “facial (or bodily) manifestation” rather than 
“facial efference” (“efferent”, ‘carrying or conducting outwards from a part or an organ 
of the body, esp. from the brain or spinal cord’). 

Before looking at the representation of facial manifestation in literature, attention
will be drawn to a famous experiment carried out by Fritz Strack (1988) which is relevant
to our argument. This is the so-called pen experiment, which supports the facial
feedback hypothesis from a scientific perspective. Subjects had to hold a pen in their
mouth in ways that either inhibited or facilitated the muscles usually associated with
smiling. Holding a pen with the teeth only was considered a facilitating condition since
it involved the muscles active in smiling; holding a pen with the lips only was considered
an inhibiting condition, since it did not involve or, rather, inhibited the muscles
associated with smiling. Subjects who had the pen between their teeth showed more
intense humour responses, when cartoons were presented to them, than subjects who
had the pen between the lips. The question of the quality of the response (affective
or cognitive) cannot be discussed here. (See Bem 1967.)
Narrative texts in which a great amount of bodily and facial activity is represented are
novels of the Thirties and the Forties of the last century (Dashiell Hammett, Raymond 
Chandler, Ernest Hemingway etc.), which I some time ago termed the behavioristic 
novel (Müller 1981). In this kind of narration the representation tends to leave out 
internal description and concentrate on outward physical manifestation, i.e. on what 
cognitivists call facial and bodily feedback. Here is an example from Hammett’s novel 
The Glass Key (1931): 


(1) When he [Ned Beaumont] rose from the telephone he was smiling with pale lips. 
His eyes were shiny and reckless. His hands shook a little. (Hammett, 1975, p. 127) 


That physical activity suggests emotion in these sentences is indicated by the reference
to the character’s smile and by his “reckless” eyes. Yet what he really thinks
remains unstated. It may be a challenge to the reader to reconstruct his undisclosed
thoughts along the lines of the theory of mind (Zunshine, 2006). Our capacity for mind
reading can be in demand if we are confronted with real people just as with literary
figures. We may encounter analogous situations in real life and in the fictional world
of the screen or the book. When a person’s gestural and facial activity goes without
words, we have to perform a cognitive feat. An example from one of Raymond
Chandler’s novels, which are told by the I-narrator Philip Marlowe, represents the body
action of a person, who has committed body-stripping, from the narrator’s perspective.
The following quotations have been collected from the scene in question, in which
pressure is put on the character by the narrator:


(2) Tiny beads of sweat showed on Flack’s lip above his little moustache. - He hunched 
down in a chair and stared at the corner of the desk. After a long time, he sighed. — 
His eyes were small and thoughtful. His tongue pushed out over his lower lip. — 
I stopped and watched the faint glisten of moisture forming on his forehead. He 
swallowed hard. His eyes were sick. - He just sat there and stared at me with 
his nasty little eyes half closed and his nasty little moustache shining. One of his 
hands twitched on the desk, an aimless movement. (Chandler, The Little Sister, 
1975, Chapter 11) 


The emotions of the character remain unspecified. The representation is restricted to 
outward physical manifestation. Yet the effect on the reader is to perceive, in a cognitive 
act, the character as extremely uncomfortable. Nor is the narrator’s response explicitly 
communicated. His attitude of distaste and contempt is only suggested by the dinginess
and meanness of the described person and by the adjective “nasty” which is applied to
the eyes and moustache. The representation of his own facial activity as a response is
not possible for the narrator, for the text is written in the form of I-narration. Yet it is
astonishing to what extent the narrator applies description of physical details also to
his own body, as the two following examples show:


(3) I grinned suddenly, bent over and quickly and with the grin still on my face, out of
place as it was, pulled off Dr. Hambleton’s toupee [...] (Chandler, 1975, The Little 
Sister, Chapter 9) 


(4) Then I put it [the telephone] down very slowly and looked at the hand that held it. 
It was half open and clenched stiff, as it was holding the instrument. (Chandler, 
1975, The Lady in the Lake, Chapter 28) 


In the first quotation the narrator even describes a facial manifestation (grinning,
the grin on his face) which he cannot see. For the explication of more complicated
examples we have to refer to the concepts of mirror neurons and theory of mind. To do
so in a graphic way we will first look at a painting which refers to a complex situation of
interfacial response, Nicolas Poussin’s “Landscape with a Man Killed by a Snake” (1648).
The reason for the shift of our argument to another art medium is that an interfacial
phenomenon is, in this ocular form, easier to grasp and interpret.


Nicolas Poussin, Landscape with a Man Killed by a Snake 1648, The National Gallery, London 


This painting, which has been characterized as a “study in fear” and has been used
as a cover image for Richard Wollheim’s book On the Emotions (1999), reproduces an
emotionally disturbing scene in an apparently serene Arcadian landscape. At the bottom
of the left-hand side a man is being killed by a snake. From the right side a young
man with a terrified, pain-distorted face, turned to the spectator, is fleeing from the place
of the accident. The emotional stress represented in the face is unfortunately not to be
seen clearly enough in the reproduction of the painting, which is exhibited in the National
Gallery in London. In the middle of the painting there is a woman, who from her
position cannot perceive the place of the accident. But her face and posture assume the
same pained appearance as the man’s. Her agitation is also shown in her wild gestures
and her forward-bending posture. She may be screaming. If she were a real person,
we would have to say that her motor cortex has been activated most strongly by the
running man she is seeing.

Poussin leaves the accident itself almost in the dark. The painting’s emphasis is on
the emotions reflected in the faces of the two other figures. The fact that the woman,
though not knowing the reason for the man’s fear, shows the same physical evidence
of fear as the man, can be, from a contemporary scientific vantage-point, explained by
the theory of mirror neurons. This theory, which I referred to earlier in this paper, is
based on the empirically gained insight, first derived from the observation of monkeys,
that the same parts of the brain are active when a person performs an action as
when the person sees another individual performing the same action (Rizzolatti, 1999,
2526-2528). This phenomenon can be explained in terms of the facial feedback theory. It
would also be possible to explain Poussin’s scene of facial interaction in terms of the
theory-of-mind concept.² This would mean that the woman in the painting, looking at
the frightened face of the man, forms an idea of what he feels and feels the same by
way of empathy, as her facial aspect indicates. In fact, in a recent article on face-to-face
interactions Martin Schulte-Rüther et al. (2007) applied both the Mirror Neuron
Theory and the Theory of Mind to face-to-face interaction. I personally would prefer
to describe the scene in the painting under discussion as face-to-face interaction with
embodied cognition on the part of the woman. She empathizes with the man on account
of the pain manifested in his face, the empathy causing motor activity. That mind reading
and cognition belong together is stated in the following quotation from Alan Richardson:
“What’s termed our ‘theory of mind’ [...] would be greatly impoverished if we did not
have a reasonably reliable, and therefore largely unconscious, cognitive mechanism for
gauging the emotions and intentions of others through reading their faces.” (Richardson
2010: 65)


2 This was suggested to me by my colleague Dirk Vanderbeke. 
Let us now look at two passages in Raymond Chandler’s novel The Little Sister, where
the protagonist realizes his emotional state only by looking at himself in a mirror:


(5) Passing the open door of the wash cabinet I saw a stiff excited face in the glass. 
I went over to the wash-basin and washed my hands and face. I sloshed cold water 
on my face and dried it off hard with the towel and looked at it in the mirror. 
“You drove off a cliff all right,” I said.

(Chandler, 1975, The Little Sister, Chapter 23) 


(6) I got up and went to the built-in wardrobe and looked at my face in the flawed
mirror. It had a strained look. I’d been living too fast. (Chandler, 1975, The Little
Sister, p. 133) 


These examples may be a far cry from Lacan’s theory of the mirror phase, in
which the child for the first time succeeds in recognizing and identifying him/herself as
a complete self in front of a mirror, but the look of the protagonist at himself in a mirror
in the novel by Chandler also has a cognitive function. In the passages quoted the
narrator finds out something about himself by looking at himself. In the first example
the narrator even talks to his face. It would be problematic to apply the mirror neuron
concept to self-perception in the two passages, for one would have to assume self-division
in the observer, in which one part of the self observes the other part, although
a dual self is indicated in the first quotation, which refers to “a stiff excited face” and not
to “my stiff excited face”. The two examples
could be explained as special instances of the theory of mind applied to a character
finding out something about his person by looking at himself in a mirror. However,
a reference to Daryl Bem’s “self-perception theory” would be more plausible. This
theory explains that people form new attitudes and beliefs, including those related to
the self, from observing their own behavior. Bem (1967) maintains that people deduce
their own internal states, like attitudes and emotions, via the same processes by which
they deduce the internal states and dispositions of others. Specifically, Bem assumes
that people use their facial expressions as a source of information to infer their own
attitudes. This is what happens in the two quotes in which Philip Marlowe looks at
his own face in the mirror. Before the mirror the character comes to self-perception
and partially also to conclusions concerning the state of his mind. Be that as it may,
in the context of sensory motor concepts the novels by Hammett and Chandler provide
abundant evidence of embodied cognition.
4 Emotion as Cognition I: An Example from Narrative Prose


As for emotion, I am making a wide claim, namely that emotional experiences,
whether actual or imaginary, are cognitive events, to which sensory motor processes
contribute essentially. A similar position has been taken by Meyer-Sickendiek
(2012), who regards the perception of moods as cognitive acts which have a profound effect
on the body. Physical manifestations like the body shaking, the heart beating faster,
eyes being averted and facial expressions like smiling or tears are more than “physiological
accompaniments”, as Oatley calls them (Oatley, 1994, p. 53; Oatley, 1992, pp. 20-21);
looked at in the context of the present project, they are part and parcel of cognition.
In this respect the approach taken here differs from Raymond Gibbs’ important
chapter on emotion (Gibbs, 2005, pp. 239-274), which focuses on consciousness rather
than cognition. I cannot, however, refer to empirical evidence to substantiate my
assumptions concerning the representation of emotion in literature. Meyer-Sickendiek
pursues such an empirical project, as he declares in the conclusion of his book, but his
methodology does not seem to reach the precision of brain scientists. As a literary critic
I have to rely on literary texts. Since poetic language does not differ radically from everyday
language and since poetic language frequently evinces linguistic features which
are a heightened form of normal speech, my examples may perhaps be not without
relevance for linguists and cognitive scientists.

I will begin with an example of narrative prose, a passage from William Faulkner’s
novel Light in August, which deals with the fate of a white African-American:


(7) He turned into [the street] running and plunged up the sharp ascent, his heart 
hammering, and into the higher street. He stopped here, panting, glaring, his 
heart thudding as if it could not or would not yet believe that the air was now the 
cold hard air of white people. (Faulkner, 1971, p. 88) 


Here physical action emerges as emotion, be it fear or revulsion. While running
from a district of blacks to a district of whites, the protagonist passes through different
emotional states. As frantic as he may be, he is aware of what happens. The passage
represents motion and emotion and cognition in an indissoluble conjunction. Emotion is
motion, as the pounding heart indicates. The passage can be regarded as an extreme
literary example of embodied cognition, in which the sensory-motor component goes
together with cognition.
5 Emotion as Cognition II: Examples from Romantic Poetry


Romantic lyric poetry represents, on the whole, a strongly subjective and intimate form 
of discourse which with its orientation on the individual self tends to be at variance 
with the socially established systems of discourse which, according to Niklas Luhmann 
(Liebe als Passion, 1982), influence linguistic and literary representation. It may be 
objected that it makes no sense to approach this kind of poetry, which is to a large extent 
characterized by interiority, from the point of view of the sensory-motor concept. I will 
try and show that such an objection would not be justified. My analysis begins with 
a look at a notoriously emotional poem, which has a curious aspect that had puzzled 
me for a long time until I looked at it in terms of the sensory-motor concept. It is Percy 
B. Shelley’s “The Indian Serenade”. In this poem it is intense emotion which makes 
the lover “arise from dreams” of his beloved and forces him to her chamber-window. 
Emotion inevitably concurs with motion. The speaker declares that it is a “spirit in my
feet” which leads him to her window:


(8) And a spirit in my feet
Hath led me — who knows how? 
To thy chamber window, Sweet! (Shelley, 1970, p. 500) 


The puzzling phrase in this poem is “a spirit in my feet”. For a neuroscientist it 
may seem absurd or downright silly to locate a “spirit” in a foot. But it is interesting 
that Shelley, who could not know anything about neuroscience, felt the need for a 
physical source or agency which caused the action of his lover, a source interestingly 
different from the heart which would in the cardiocentric tradition (Niemeier 2011) be 
responsible for a lover’s action. The heart or the soul is, at least in this poem, not the 
seat of the feelings. Nowadays we would of course trace the source of the lover’s
motion in the poem to his brain. For want of any such concept the foot had to serve
as a kind of substitute. The passage explicitly illustrates a coincidence of emotion and
motion with cognitive implications. Cognition is involved in the self-observation and
the self-description of the poem’s speaker.

The following quotations from romantic poems provide evidence for the hypothesis
that emotion tends to be represented as cognition and to be embodied in motion in the
poetry of the age. Lines like -


(9) I die! I faint! I fail (“Indian Serenade”, Shelley, 1970, p. 580)


(10) I pant, I sink, I tremble, I expire! (Epipsychidion, Shelley, 1970, p. 424)


(11) My heart aches, and a drowsy numbness pains
My sense, as though of hemlock I had drunk, 
Or emptied some dull opiate to the drains 
One minute past and Lethe-wards had sunk: 
[...] (“Ode to a Nightingale”, 1-4, Keats, 1970, p. 207) 


- have traditionally been called subjective or self-expressive. The most important repre- 
sentative of a poetics of expression which accounts for such texts, emphasizing notions 
of subjectivity and expressiveness, is M. H. Abrams’ famous book The Mirror and the 
Lamp (1953). In the light of sensory-motor concepts such poems should rather be called 
self-diagnostic or self-reflexive. It is significant that in all these examples emotion co- 
incides with motion. Emotion manifests itself in physical terms or, more precisely, in 
motion. This is the case even in the lines from Keats’ ode, although the depressed state 
of having “sunk” down is described only on a metaphorical level. Poetry intensifies here
what we have noted above with reference to everyday language, namely that emotional
states are frequently expressed in physical terms, for example in words like “downcast”
(German “niedergeschlagen”) or “spurred” (German “beflügelt”). Keats is, incidentally,
one of the greatest diagnosticians in English poetry, which reflects his deep interest in
medicine and new ideas in brain anatomy and neurophysiology (Richardson 2010: 75).
Here is another example. In “Ode on a Grecian Urn” the “happy” world depicted on the
urn is described, which is far above “all breathing human passion”


(12) That leaves a heart high-sorrowful and cloy’d, 
A burning forehead, and a parching tongue. (Keats, 1970, p. 210) 


The lines refer to an emotion of extreme sorrow, but the terms in which it is rep- 
resented are intensely physical or sensory, almost in the form of a medical diagnosis. 
Self-description is intensified to the point of self-diagnosis: the heart is sickened, the 
forehead burning, and the tongue dried out. The great amount of physical manifestation
in these and many more cases is a testimony of cognition rather than a mere
accompaniment of it.

In order to point at the recipient’s side of an embodied understanding of emotion
I will quote a passage from the poet A. E. Housman’s famous lecture The Name and
Nature of Poetry, in which he equates the emotional effect of poetry with a physical one:


(13) Experience has taught me, when I am shaving of a morning, to keep watch over
my thoughts, because, if a line of poetry strays into my memory, my skin bristles
so that the razor ceases to act. This particular symptom is accompanied by a shiver
down the spine; there is another which consists in a constriction of the throat and
a precipitation of water to the eyes: and there is a third which I can only describe
by borrowing a phrase from one of Keats’s last letters, where he says, speaking of
Fanny Brawne, ‘everything that reminds me of her goes through me like a spear’.
The seat of this sensation is the pit of the stomach. (Housman, 1933, p. 47)


This is an extreme example of the effect of poetry caused by emotions which react 
on the body in a multitude of ways from a bristling of the skin to the sense of being 
penetrated by a spear in the pit of the stomach. The emphasis placed in my argument
on the physical side of the representation of emotions could be supplemented
by investigations of the production and the reception side of the poetic process,
which is not possible within the frame of this article.


6 Embodied Cognition in a Modern Poem 


The lyrical poems dealt with so far have been taken from romantic poetry. To give just a
glimpse of embodied cognition, which continues to emerge, in varied forms, in later
poetry, at least one twentieth-century poem will be adduced. In this context the study by
Burkhard Meyer-Sickendiek has to be referred to again, Lyrisches Gespür. Vom geheimen
Sensorium moderner Poesie (2012). Meyer-Sickendiek is strongly interested in the lyric
representation of fugitive moods which are barely felt out by a sensitive subject, and he
discusses the corporeality of perception. His theoretical approach, which is focused on
mood (“Stimmung”) rather than feeling, is related to the New Phenomenology of Hermann
Schmitz. It provides the basis for extremely subtle analyses. Although it does not
refer to sensory-motor concepts, it ties in with the present study, since it understands
the apprehension of a mood (“das Erspüren einer Stimmung”) as a cognitive act. The
poem to be looked at here is by William Carlos Williams:


(14) To a Poor Old Woman 


munching a plum on 
the street a paper bag 
of them in her hand 


They taste good to her
They taste good
to her. They taste
good to her


You can see it by
the way she gives herself
to the one half
sucked out in her hand


Comforted
a solace of ripe plums
seeming to fill the air
They taste good to her
(Williams, 1951, p. 99)


In this poem there is a total focus on the woman who is eating plums with the great- 
est relish, even sucking the fruit from her hand. Her feelings of sensuous pleasure are 
denoted by the statement “They taste good to her”. Even the last stanza, which
articulates a kind of epiphany, is concentrated on smell and taste. In the second stanza it is
the device of repetition with the shifting of the line ends within the repeated sentence,
an intricate counterpointing of syntax and meter, which has an iconic and intensifying
effect. The shifting enjambment mimes the process of munching and savouring the
plums. The notion of embodied feeling is here expressed by a distinct poetic technique:
repetition. The last stanza conveys a sense of satisfaction which transcends the limits
of the object beheld. Emotion words refer to sensuous contentedness (“Comforted / a
solace of ripe plums”) and an impression of the air being filled with the smell of plums
(“seeming to fill the air”) is evoked. There is also a cognitive component in the woman’s
pleasure. She obviously knows what she is doing and she enjoys what she is doing, as
is suggested by the repeated clause, “They taste good to her”. The poem is an interesting
case in that in addition to the visual dimension it includes the senses of taste and
smell. Its imagery is multisensory (Starr, 2010). It is one of the purest examples of
embodied feeling in poetry.


7 Emotion Manifested in Kinetic Body Metaphors 


From a cognitive point of view there is hardly a difference between metaphor in general
and in literary language. Lakoff and Johnson argue that human thought and speech are
constructed metaphorically from the basic kinesthetic experiences of living in a body
(Crane 2010: 104). In an illuminating experiment Boulanger, Hauk and Pulvermüller
(2009) were able to show that idioms which contain action metaphors, like “grasping the
idea”, activate the motor cortex just as non-figurative expressions referring to action
do. Their conclusion is that “Motor systems of the brain, including motor and premotor
cortex, and the motor cognitions they process appear to be central for understanding
idioms” (Boulanger, Hauk and Pulvermüller 2009: 1913). Metaphors which refer to
motions of the body like


(15) My blood freezes
I could vomit
He fell into a tumult of contradictory feelings
Grasping ideas requires some intelligence


are not radically different from metaphors in poetry. The most important difference 
seems to be that poetic metaphors usually strive for the quality of novelty or originality. 
It can be said that metaphor is a supreme device for expressing emotion in poetry. It is
in fact a catalyst of emotion. In this context, too, the relation between motion and
emotion, which is our topic, is particularly frequent, as the poems quoted above show.
Further evidence is provided by the following examples from Gerard Manley Hopkins’
so-called terrible sonnets (Poems 65 and 67):


(16) No worst, there is none. Pitched past pitch of grief, 
More pangs will, schooled at forepangs, wilder wring. (Hopkins, 1967, p. 100) 


(17) Selfyeast of spirit a dull dough sours. (Hopkins, 1967, p. 101) 


(18) I am gall, I am heartburn. God’s most deep decree
Bitter would have me taste: my taste was me; 
Bones built in me, flesh filled, blood brimmed the curse. (Hopkins, 1967, p. 101) 


In each of these instances extreme emotions are rendered in physical terms: in the 
first case spiritual pain manifests itself physically in terms of wringing pangs, in the 
second the self’s spiritual helplessness is expressed by the image of the leaven of the 
self, unable to raise a dough, and in the third metaphors of taste are used to express 
the emotional state of the self, which is, in the absence of God, thrown back on itself 
and has to taste itself. It should not be forgotten that these are written or rather printed
words, condensed dust on the page, if you do not read them aloud, and yet there is
an enormous sense of physicality to them;³ the body as the testimony of emotions
is powerfully present, for, as we have argued, the respective areas of the brain are
activated regardless of their metaphorical character. At the same time Hopkins’ lines
evince great self-awareness and self-perception on the part of the speaker. Again I
would not like to call such poetic discourse self-expression. On account of its strong
cognitive quality I would prefer the term self-definition or self-diagnosis.


³ I am aware that reading a poem silently also has a sensory quality. If this were not the case, sound effects
and the rhythmical quality of the poem would remain unnoticed. The exploration of this phenomenon
would be another challenge for cognitive research.


8 A Note on Embodied Cognition in Literature and the 
Historical Aspect - Shakespeare 


The examples adduced for embodied cognition in the context of the representation of
emotion have been taken from nineteenth- and twentieth-century literature. The textual
corpus should, of course, be extended to earlier literature, and it should be asked
whether historical developments can be identified in the treatment of the relation of
emotion and motion in literature. A first impression gained during the research for this
study is that a climax of embodied cognition and emotion is to be found in the literature
of the romantic period. However, it is evident that further historical research and an
expansion of the corpus are needed. Petrarch’s Canzoniere, for instance, one of the most
important models for lyric production all over Europe, has explicit descriptions of
feelings in physical terms such as the lover’s freezing and burning in his changing moods.
It would certainly be fruitful to look at Petrarch and his tradition or at metaphysical
poetry with respect to embodied feeling and cognition, but I would like to have just a
brief look at Shakespeare, who in this, as in so many other respects, proves to be quite
modern. Darwin quotes him as one of the chief authorities on human expression
(Richardson 2010: 71). There is no room to go into Mary Thomas Crane’s important study
of conceptual metaphors, Shakespeare’s Brain (2000). First, two lines from Hamlet’s
first soliloquy are to be quoted:


(19) O that this too sullied flesh would melt, 
Thaw, and resolve itself into a dew, 
(Shakespeare 2005, p. 113, Act I, Scene 2, ll. 129-30) 


Hamlet’s disillusionment with his family and his self-loathing are manifested in physical
terms, in his desire for his body to melt away. The protagonist does not in the first
place refer to his feelings of depression, but to his psycho-physical condition. Hamlet’s
world-weariness manifests itself in a desire for the dissolution of his body. Another
example is Macbeth, who contemplates murder to gain the crown, yet is so overcome by
fear that the image of murder unsettles his bodily functions:


(20) [...] why do I yield to that suggestion
Whose horrid image doth unfix my hair,
And make my seated heart knock at my ribs,
Against the use of nature? (Shakespeare, 1971, p. 21, Act I, Scene 3, ll. 134-37)


Here again the image of the body’s life, with the hair standing on end and the heart 
knocking at the ribs, coincides with the protagonist’s feelings. In fact, the physiological 
event is not a mere symptom, but a manifestation of mental disorder. There is also a 
pronounced cognitive dimension in the passage, in the form of self-observation and self- 
diagnosis. Macbeth feels an emotion and simultaneously perceives it as a manifestation 
of the body. 

An examination of modernist poetry could lead to the result that there are poets like
T.S. Eliot and Wallace Stevens who tend to write ‘disembodied’ poetry, i.e. poetry
that is averse to embodied cognition, while others like Ezra Pound and William Carlos
Williams are in favour of embodied cognition. As early as 1988 Max Nanny wrote an
article on Ezra Pound as a “Right Brain Poet”. By way of analogy one could call T.S.
Eliot a left brain poet. But such classifications should be treated with caution from a
cognitive and historical point of view. As far as Eliot is concerned, his theory of the
“dissociation of sensibility” could be discussed in terms of the cognitive approach.


9 Conclusion 


It may be objected that the material treated in this study is rather diversified and dis- 
parate and that the evidence presented moves freely between genres and periods, but as 
an extenuating circumstance it can be pointed out that an innovative approach is tried 
out in this contribution, which opens new perspectives and is waiting to be substan- 
tiated in a more comprehensive and systematic procedure. Some results can at least be 
ascertained. Emotion and motion have turned out to be two sides of a coin. Motion 
is understood as the kinesthetic experience of the body, which comes into play with 
any emotional experience, no matter whether in real life or fiction, although the liter- 
ary artist has aesthetic strategies at his or her disposal which make possible intensified 
representation of emotion. Analyses have shown that literature is a veritable field for 
experimentation in matters of embodied cognition. Embodied cognition could be iden- 
tified in the representation of facial feedback, both in views from outside (external) and 
from inside (internal). The thesis that emotion is closely linked to physical manifestation,
that emotion is actually inseparable from motion, has been confirmed in numerous
examples from narrative prose and poetry. The application of the term “sensory-motor”
to cognitive phenomena represented in literature proved to have certain advantages,
since it allows for the appreciation of different aspects of body activity involved in
cognition (changes of the position of the body, movement of the limbs, facial expression,
smelling, taste), which could be found in the instances of represented emotion examined.
It should be noteworthy and encouraging for cognitivists and neuroscientists
advocating sensory-motor concepts that in literature there is massive evidence for their
theories from times in which nobody as yet dreamed of neurons, let alone sensory-motor
concepts. The interdisciplinary benefit can be mutual, the literary scholar profiting from
the mind and brain scientist exploring hitherto unknown dimensions of human reality,
and the scientist learning that poets have all along known more about the mind’s
construction than they would have believed possible.


10 References 


Adelmann, P. K. and R. B. Zajonc (1989). Facial Efference and the Experience of Emotion. 
Annual Review of Psychology, 40, 249-280. 

Barsalou, L. W. (2008). Grounded cognition. Annual Review of Psychology, 59, 617-645. 

Barsalou, L. W. (2010). Grounded cognition: Past, present, and future. Topics in Cognitive 
Science, 2, 716-724. 

Boulanger, V., Hauk, O. & F. Pulvermüller (2009). Grasping Ideas with the Motor System:
Semantic Somatotopy in Idiom Comprehension. Cerebral Cortex 19, 1905-1914. 

Chandler, R. (1975). The Lady in the Lake. Harmondsworth: Penguin. 

Chandler, R. (1975). The Little Sister. Harmondsworth: Penguin.

Crane, M. T. (2000). Shakespeare’s Brain: Reading with Cognitive Theory. Princeton: 
Princeton University Press. 

Crane, M. T. (2010). Analogy, Metaphor, and the New Science: Cognitive Science and 
Early Modern Epistemology. In: Zunshine (Ed.), 103-114. 


Damasio, A. (1994). Descartes’ Error. London: Penguin. 


Damasio, A. (1999). The Feeling of What Happens. Body and Emotion in the Making of
Consciousness. New York, San Diego, London: Harcourt Brace.

Faulkner, W. (1971). Light in August. Harmondsworth: Penguin.


Gallese, V. & G. Lakoff (2005). The Brain’s Concepts: The Role of the Sensory-Motor 
System in Reason and Language. Cognitive Neuropsychology, 22, 455-479. 


Gibbs, R. W. J. (2005). Embodiment and Cognitive Science. Cambridge: Cambridge
University Press.




Wolfgang G. Müller


Glenberg, A. M. & M. P. Kaschak (2002). Grounding language in action. Psychonomic
Bulletin & Review, 9, 558-565. 

Hammett, D. (1975). The Glass Key. London and Sydney: Pan Books.

Hogan, P. C. (2010). On Being Moved: Cognition and Emotion in Literature and Film. In 
Zunshine (Ed.), 237-256. 

Hopkins, G. M. (1967). The Poems of Gerard Manley Hopkins. London, New York, 
Toronto: Oxford University Press. 

Housman, A. E. (1933). The Name and Nature of Poetry. Cambridge: At the University 
Press. 

Keats, J. (1970). Poetical Works. London, Oxford, New York: Oxford University Press. 

Lakoff, G. & M. Johnson (1999). Philosophy in the Flesh. New York: Basic Books. 

Meyer-Sickendiek, B. (2012). Lyrisches Gespür. Vom geheimen Sensorium moderner
Poesie. München: Fink.

Müller, Wolfgang G. (1981). Implizite Bewußtseinsdarstellung im behavioristischen
Roman der zwanziger und dreißiger Jahre: Hammett, Chandler, Hemingway.
Amerikastudien, 26, 193-211.

Nanny, Max (1988). Ezra Pound: Right Brain Poet. Paideuma, 17.2/3 (Fall/Winter), 
93-109. 

Niemeier, S. (2011). Culture-specific concepts of emotionality and rationality. In M.
Callies, W. R. Keller & A. Lohöfer (Eds.), Bi-Directionality in the Cognitive Sciences.
Amsterdam, Philadelphia: Benjamins, 43-56.

Oatley, K. (1992). Best Laid Schemes: The Psychology of Emotions. Cambridge etc.:
Cambridge University Press.

Oatley, K. (1994). A taxonomy of the emotions of literary response and a theory of
identification in fictional narrative. Poetics, 23, 53-74.

Pezzulo, G., Barsalou, L. W., Cangelosi, A., Fischer, M. A., McRae, K. & M. Spivey
(2011). The mechanics of embodiment: A dialogue on embodiment and computational
modeling. Frontiers in Cognition, 2(5), 1-21.

Ratcliffe, M. (2008). Feelings of Being. Phenomenology, Psychiatry and the Sense of 
Reality. Oxford: Oxford University Press. 

Richardson, A. (2010). Facial Expression Theory from Romanticism to the Present. In 
Zunshine (Ed.), 65-83. 

Schrott, R. & A. Jacobs (2011). Gehirn und Gedicht. Wie wir unsere Wirklichkeiten
konstruieren. Hamburg: Hanser Verlag.




Motion and Emotion 


Shakespeare, W. (2005). Hamlet. Englisch-deutsche Studienausgabe. Ed. N. Greiner, W. 
G. Müller. Tübingen: Stauffenburg Verlag. 

Shakespeare, W. (1971). Macbeth. Ed. K. Muir. Arden Edition. London: Methuen. 

Shelley, P. B. (1970). Poetical Works. London, Oxford, New York: Oxford University 
Press. 

Starr, G. (2010). Multisensory Imagery. In Zunshine (Ed.), 275-291. 

Stockwell, P. (2002). Cognitive Poetics. London, New York: Routledge. 

Williams, W. C. (1951). The Collected Later Poems. New York: New Directions. 

Wilson, M. (2002). Six views of embodied cognition. Psychonomic Bulletin & Review, 
9, 625-636. 

Wolman, B. B., Ed. (1973). Handbook of General Psychology. Englewood Cliffs: Prentice- 
Hall. 

Zunshine, L. (2006). Why We Read Fiction: Theory of Mind and the Novel. Columbus: 
Ohio State University Press. 

Zunshine, L., Ed. (2010). Introduction to Cognitive Cultural Studies. Baltimore: The Johns 
Hopkins Press. 




THE DIVERSITY OF SENSORY-MOTOR 
CONCEPTS AND ITS IMPLICATIONS 


GERARD STEEN 


Sensory-Motor Concepts and Metaphor in Usage 


Gerard Steen 
University of Amsterdam,
Faculty of Humanities
g.j.steen@uva.nl


Abstract 

This paper explores the relation between metaphor and Sensory-Motor concepts in
language use. Sensory-Motor concepts in language use are defined as a number of semantic
fields distinguished by WMatrix, comprising Sensory lexis and Motor lexis, including
words under ‘Sight’ and ‘Sound’ as well as ‘Moving, Coming, Going’ and ‘Pushing,
Putting, Pulling’. The incidence of this lexis and its metaphorical use is examined in
the VU Amsterdam Metaphor Corpus, a 190,000-word selection from BNC Baby annotated
for metaphor. The relation between the selected semantic fields and metaphorical
and non-metaphorical use reveals a substantial distinction between the metaphorical use
of Sensory-Motor lexis and all other lexis as well as between the metaphorical use of
Sensory lexis and Motor lexis. Interactions with word class and with genre are also explored,
indicating more specific behavior of each of the various groups of lexis expressing the
distinct concept categories. The paper concludes by suggesting that Sensory-Motor
concepts may indeed play a special role in metaphorical language use, and that additional
distinctions are needed to capture the four-way interaction between metaphor, word
class, register and semantic field.

Keywords: Sensory-Motor concepts, semantic fields, metaphor, language use 


1 Introduction 


How are Sensory-Motor concepts expressed in language? And when are Sensory-Motor 
concepts used metaphorically in language? I will explore these questions in order 
to offer some tentative views of the relation between Sensory-Motor concepts and 
metaphor in usage. The connection between Sensory-Motor concepts and metaphor 
is natural since Sensory-Motor concepts afford one of the most popular source domains
for generating metaphorical language and thought: according to the influential
cognitive-linguistic account of metaphor launched by Lakoff and Johnson (1980), we
think, for instance, of understanding as a sensory experience (UNDERSTANDING IS SEEING)
and of change as a motor experience (CHANGE IS MOTION). More recently, one
basic group of metaphors, called ‘primary metaphors’, has been distinguished on the
basis of their immediate grounding in embodied cognition by means of so-called ‘image
schemas’, which are presumably derived from sensory-motor experience (e.g., Gibbs,
2006; Hampe, 2005). Since then, Sensory-Motor concepts have been taken as fundamental
to figuration in thought and language (e.g., Lakoff and Johnson, 1999; Mandler,
2004).

In this paper I will utilize a substantial set of generally representative linguistic 
data to explore the relation between Sensory-Motor concepts and metaphor in usage. 
Previous work done in our lab led to the first attempt at an encompassing corpus- 
linguistic description of the relation between metaphor and its use in language (Dorst, 
2011; Herrmann, 2013; Kaal, 2012; Krennmayr, 2011; Pasma, 2011; Steen et al., 2010a, b). 
This research on metaphor in usage has shown a highly varied distribution of metaphor
across registers and word classes:


• Some registers are more metaphorical than others, ranging from academic and
news through fiction to conversation.

• Some word classes are also more metaphorical than others, ranging from prepositions
and determiners through nouns and verbs to adjectives and adverbs.

• And some word classes are more metaphorical in some registers than in others;
for instance, adjectives have higher metaphorical usage in news, fiction and
conversation than may be expected by chance, but not in academic texts, where they
do behave according to chance (Steen et al., 2010a: 211).


Since, in addition, some word classes are more frequent in some registers than others (cf. 
Biber and Conrad, 2009), the underlying general interaction between register and word 
class needs to be taken into account when interpreting the relation between metaphor, 
register and word class. 

These patterns were determined without paying explicit attention to their relation 
to distinct semantic fields. The data do naturally include the use of all semantic fields 
that can be distinguished, including those fields presumably relating to Sensory-Motor 
concepts. This means that, in theory, the relation between Sensory-Motor concepts and 
metaphor in usage could be analyzed as a four-way interaction, between (a) Sensory- 
Motor concepts, (b) metaphor, (c) register and (d) word-class. Taking our previous 
work as a provisional starting point, the simplest model of this four-way interaction
would yield a 2 × 2 × 4 × 8 design for analysis, with Sensory-Motor concepts having two
levels (Sensory-Motor concept or not), metaphor having two levels (metaphor or not),
register having four levels (academic, news, fiction, and conversation), and word class
having eight levels (adjective, adverb, conjunction, determiner, noun, preposition, verb,
remainder). Such a design is clearly much too complex to remain meaningful without
further context, certainly for an exploratory paper like the present one. I am therefore
going to dismantle the four-way interaction into a number of components that are
theoretically meaningful in order to achieve a first understanding of the possible relation
between Sensory-Motor concepts and metaphor in usage. The following findings are
hence partial and tentative, in the awareness that future research on a grander scale
will have to take into account more complex interactions as possibly influencing the
general trends.
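
To make the size of the full design concrete, the following minimal Python sketch enumerates its cells. The level labels are taken from the description above; the variable names are illustrative assumptions and not part of the original analysis.

```python
from itertools import product

# Levels of the four factors as named in the text; variable names are illustrative.
SENSORY_MOTOR = ("sensory-motor", "other")
METAPHOR = ("metaphor", "non-metaphor")
REGISTER = ("academic", "news", "fiction", "conversation")
WORD_CLASS = (
    "adjective", "adverb", "conjunction", "determiner",
    "noun", "preposition", "verb", "remainder",
)

# Every cell of the full four-way design.
cells = list(product(SENSORY_MOTOR, METAPHOR, REGISTER, WORD_CLASS))
print(len(cells))  # 2 * 2 * 4 * 8 = 128
```

With 128 cells, many of them sparsely filled in a corpus of this size, it is easy to see why the interaction is broken down into smaller components.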

The overall aim of this exploration is to sketch a first picture of the employment 
of Sensory-Motor concepts for metaphorical purposes in language use. Data collection 
and analysis are based on a data set that has since been corrected, requiring another 
round of research in order to take these corrections into account. I have also selectively 
applied just a handful of small-scale statistical tests that ideally need inclusion in a more 
encompassing and sophisticated approach in the future. What I aim to do in this paper, 
therefore, is to present a relatively informal account of the most important tendencies in 
the data that are visible in spite of the error and noise I just acknowledged. Since these 
most important tendencies are starkly visible, future research is not expected to have
drastic effects on the present conclusions and is hoped to profit from the first sketch
and new questions I can offer at this moment.


2 Method 


The data were collected from the VU Amsterdam Metaphor Corpus (Krennmayr and
Steen, in press), a sample of just under 190,000 words from the BNC Baby, which itself is
a four-million-word sample from the British National Corpus. The latter is a
100-million-word collection of samples of written and spoken language from a wide range
of sources, representative of present-day British English. The VU Amsterdam Metaphor
Corpus (from now on, ‘VUAMC’) was annotated for metaphor, yielding about 25,000
metaphor-related words (13.6%). These were then analyzed for relations with word class and
register, revealing a three-way interaction between metaphor, word class, and register
(Steen et al., 2010a, b). The version of the database used for the present paper still
includes a number of mistakes, both in Part-of-Speech tagging and in metaphor
annotation. These have since been corrected for a second, revised edition, but the figures
presented here are adequate for a first exploration of the
trends discovered.

All separate VUAMC text files were concatenated into four long files organized by
register: academic texts, news texts, fiction, and conversation. Each of these files was
uploaded into WMatrix, a web interface including a tool for semantic field identification
(Rayson, 2009). The semantic fields distinguished in WMatrix are applied in its lexicon 
which describes the various senses of the distinct words in the English language that 
have been included. Words in a text that is uploaded can thus be automatically analyzed 
for the semantic domains that WMatrix attaches to the lexical units. WMatrix makes 
a distinction between 21 broadly defined semantic fields, including M, ‘movement, lo- 
cation, travel and transport’, and X, ‘psychological actions, states, and processes’, with 
additional subcategories. Six Sensory-Motor domains were deemed of highest inter- 
est to the exploratory purposes of this study: M1, ‘Moving, Coming, and Going’, M2, 
‘Pushing, Putting, and Pulling’, and M6, ‘Location and Direction’, as well as X3, ‘Gen- 
eral Sensory’, X3.2, ‘Sound’, and X3.4, ‘Sight’. Lexical items representing these domains 
include leave, turn, walk (M1), take, place, hold (M2), to, in, there, where (M6), feel, feeling,
experience, sense (X3), hear, sound, noise (X3.2), and see, look, eye (X3.4). It should be 
noted that all of these classifications are based on independent work done for WMatrix 
by Paul Rayson and his associates (Rayson, 2009). I hence take on board any decisions 
they have made in assigning particular lexemes to particular semantic fields and
conceptual categories. For instance, it is self-evident that these decisions have to do with
the value of lexical units in the present-day system of the English language and ignore
their historical provenance, even though this may be relevant for other research purposes.
It is only by exploiting the tool as it is available now in empirical work in specific
areas like the one reported here that constructive criticism can be formulated and the
tool can be improved for future work.
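
The selection of domains can be summarized as a small lookup table. The sketch below is illustrative only: the tags, labels and sample items are those listed above, whereas the dictionary layout, the function name and the grouping of M6 with the motor tags are assumptions made for the example.

```python
# Tags, labels and sample items as listed in the text; the data layout and
# all names below are assumptions made for illustration.
SM_DOMAINS = {
    "M1": ("Moving, Coming, and Going", ["leave", "turn", "walk"]),
    "M2": ("Pushing, Putting, and Pulling", ["take", "place", "hold"]),
    "M6": ("Location and Direction", ["to", "in", "there", "where"]),
    "X3": ("General Sensory", ["feel", "feeling", "experience", "sense"]),
    "X3.2": ("Sound", ["hear", "sound", "noise"]),
    "X3.4": ("Sight", ["see", "look", "eye"]),
}


def domain_kind(tag: str) -> str:
    """Return 'motor' for the M tags and 'sensory' for the X tags.

    Treating M6 as 'motor' is an assumption of this sketch.
    """
    if tag not in SM_DOMAINS:
        raise KeyError(f"{tag} is not one of the six selected domains")
    return "motor" if tag.startswith("M") else "sensory"


for tag, (label, examples) in SM_DOMAINS.items():
    print(tag, domain_kind(tag), label, ", ".join(examples))
```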


An example of the output of WMatrix for one sentence is given below: 


0000025 010 AT     The            Z5
0000026 010 MC2    1990s          T1.3 N1 T3
0000027 010 VH0    have           Z5 A9+ A2.2 S4
0000028 010 VVN    witnessed      X3.4 G2.1 A10+@ S9
0000029 010 AT1    a              Z5
0000030 010 NN1@   shift          A2.1+ $54+¢ T1.3/13.1
0000031 010 II     in             Z5
0000032 010 AT     the            Z5
0000033 010 NN1    art            C1 X9.1+
0000034 010 NN1    establishment  T2+ Hie :Gl..le I3..1¢
0000035 010 GE     's             Z5
0000036 010 NN2    attitudes      X2.1/E1
0000037 010 II     towards        Z5
0000038 010 NN1    art            C1 X9.1+
0000039 010 VVN    produced       A2.2 A1.1.1 A10+ K4 K3 94.3 F4
0000040 010 II21   outside        M6[i2.2.1 A1.8-[i2.2.1 Z5
0000040 020 II22   of             M6[i2.2.2 A1.8-[i2.2.2
0000041 010 APPGE  its            Z8
0000042 010 JJ     traditional    S1.1.1 A6.2+ T3++
0000043 010 NN2    parameters     A1.7+ N3.1 N2
0000044 001


Case numbers are followed by clause identifiers and Part-Of-Speech tags for the 
relevant lexical unit located in the fourth column. Each lexical unit is then followed 
by the list of semantic field tags assigned to it by WMatrix. If a word is tagged as M1,
M2, or M6, or as X3, X3.2, or X3.4, as is the case for units 028, witnessed, and 040, outside/of,
it is included in our study as expressing a Sensory-Motor concept.
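
The selection rule just described can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the study: the whitespace-separated line layout is assumed from the sample output above, and the names `SELECTED`, `TAG_CORE`, `parse_line`, and `sensory_motor_tags` are hypothetical. Every semantic tag listed for a word is checked, not only the first one.

```python
import re

# The six Sensory-Motor domains selected for the study.
SELECTED = {"M1", "M2", "M6", "X3", "X3.2", "X3.4"}

# Core of a semantic tag: a letter plus a dotted number, e.g. "X3.4" in "X3.4+"
# or "M6" in "M6[i2.2.1". Suffixes and multiword markers are ignored (assumption).
TAG_CORE = re.compile(r"[A-Z]\d+(?:\.\d+)*")


def parse_line(line):
    """Split one output line into case number, clause id, POS tag, word, and tags."""
    parts = line.split()
    if len(parts) < 4:  # e.g. a line that carries no lexical unit
        return None
    case_no, clause_id, pos, word = parts[:4]
    return {"case": case_no, "clause": clause_id, "pos": pos,
            "word": word, "tags": parts[4:]}


def sensory_motor_tags(tags):
    """Return the selected domains found among a word's semantic tags."""
    hits = []
    for tag in tags:
        for piece in tag.split("/"):  # slash tags combine two fields
            match = TAG_CORE.match(piece)
            if match and match.group(0) in SELECTED:
                hits.append(match.group(0))
    return hits


sample = [
    "0000027 010 VH0 have Z5 A9+ A2.2 S4",
    "0000028 010 VVN witnessed X3.4 G2.1 A10+@ S9",
    "0000040 010 II21 outside M6[i2.2.1 A1.8-[i2.2.1 Z5",
]
for line in sample:
    unit = parse_line(line)
    print(unit["word"], sensory_motor_tags(unit["tags"]))
```

On these three lines, only witnessed (X3.4) and outside (M6) would be retained as Sensory-Motor words.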

A special feature called ‘domain push’ was activated for the selected domains. The 
domain push function enables identification of all lexical units that have these semantic 
domains, even when these semantic domains are not the relevant sense in context. The 
latter is clearly important for the identification of those words that are used in abstract 
senses in the current context but in concrete Sensory-Motor senses in other contexts. 

All WMatrix output was visually inspected and a small set of overt errors were ad- 
justed or removed. The data were then included in an SPSS database containing the 
general VUAMC information, including register and text identification, word class in- 
formation, and metaphor information. This database was subjected to a small number 
of non-parametric statistical analyses by means of the chi-square test in order to exam- 
ine first associations between a number of selected variables for portions of the data. 
A more sophisticated and encompassing quantitative analysis is envisaged for future
research. 


3 Results 


3.1 Sensory Concepts, Motor Concepts, and Metaphor 


Sensory concepts and Motor concepts in this study each comprise three subcategories, 
which may or may not display their own specific behavior in relation to metaphor. 
That is what we will examine in this section. We now first turn to the group of Sensory 
concepts, divided into three categories: General Sensory concepts, Sound concepts and 
Sight concepts. Their relation to metaphorical use is displayed in table 1. 

                   Non-metaphor     Metaphor       Total
General sensory    322 (62.8)       191 (37.2)     513 (100.0)
Sound              348 (76.3)       108 (23.7)     456 (100.0)
Sight              843 (70.7)       350 (29.3)     1,193 (100.0)
Total              1,513 (70.0)     649 (30.0)     2,162 (100.0)

Table 1: Frequencies (and row percentages) of three types of Sensory words, divided by non-metaphorical and
metaphorical use

There are 2,162 words in the VUAMC (N = 186,688) that are connected to the three
selected Sensory domains, which is just over one percent. There is substantial variation
in frequency between the three Sensory concept types: Sight concepts (n = 1,193) comprise
more than half of all Sensory concepts, while General Sensory concepts (n = 513) and
Sound concepts (n = 456) account for the other half in roughly equal measure.
The relation between the three concept types and metaphor is significant (χ²(2) = 21.68,
p < 0.001), with Phi and Cramér’s V indicating a modest effect size (0.10, p < .001). General
Sensory concepts display a greater proportion of metaphorical usage than average
(37.2 %), while Sound concepts display a smaller proportion of metaphorical usage
(23.7 %) than average; Sight concepts are roughly average in their metaphorical use
(29.3 %). The significant chi-square test indicates that this association between concept
type and metaphor is statistically reliable. Since we do not have comparable figures for 
other languages and since the data as well as method of analysis are relatively specific, I 
do not want to speculate about their general significance. In the following sections we 
will take a closer look at the nature of all three sets of Sensory concepts. There we will 
make the link with their distribution across word classes and registers and attempt to 
understand how Sensory concepts relate to these essential dimensions of metaphorical 
language use. 
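
The chi-square statistic and effect size reported for Table 1 can be recomputed from the cell counts. The following is a minimal sketch in plain Python rather than the SPSS procedure used in the study; the function name `chi_square` and the dictionary layout are illustrative assumptions.

```python
import math

# Non-metaphorical and metaphorical counts per Sensory category (Table 1).
observed = {
    "General sensory": (322, 191),
    "Sound": (348, 108),
    "Sight": (843, 350),
}


def chi_square(table):
    """Pearson chi-square, degrees of freedom, and Cramér's V for an r x c table."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(row_totals)
    chi2 = 0.0
    for r, row in enumerate(rows):
        for c, obs in enumerate(row):
            # Expected count under independence of category and metaphor.
            expected = row_totals[r] * col_totals[c] / n
            chi2 += (obs - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(col_totals) - 1)
    smaller_dim = min(len(rows) - 1, len(col_totals) - 1)
    cramers_v = math.sqrt(chi2 / (n * smaller_dim))
    return chi2, df, cramers_v


chi2, df, v = chi_square(observed)
print(f"chi2({df}) = {chi2:.2f}, Cramér's V = {v:.2f}")
```

With the counts from Table 1, the result should come out close to the reported values, a chi-square of about 21.7 on 2 degrees of freedom and an effect size of about 0.10.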

Irrespective of this variation it is highly evident that Sensory concepts are much 
more metaphorical than all other concepts in the VUAMC: as mentioned above, the 
complete corpus has an average of 13.6 % of metaphorical use (Steen et al., 2010a, b). 
The odds of Sensory concepts being metaphorical in language are about three times 
higher than the odds of all other concepts being metaphorical in language. The theoret- 
ical assumption that Sensory concepts may play a special and relatively frequent role in 
the grounding of metaphor in usage is hence supported by these corpus-linguistic data. 
It lends further credence to the cognitive-linguistic proposals in Hampe (2005), a collection
of chapters on the relation between image schemas as the mental repository of
Sensory-Motor experience on the one hand and abstract cognition, including metaphorical
cognition and language use, on the other. For instance, here is Mark Johnson, who
writes: 


The principal philosophical reason why image schemas are important is that they 
make it possible for us to use the structure of sensory and motor operations 
to understand abstract concepts and draw inferences from them. The central 
idea is that image schemas, which arise recurrently in our perception and bodily 
movement, have their own logic, which can be applied to abstract conceptual 
domains. (2005: 24) 


At this point it may be useful to list the most frequent lexical units that are related 
to each of the three semantic domains of Sensory concepts and show their relation to 
metaphorical and non-metaphorical use (see table 2). It is striking that the ten most 
popular Sensory concepts for each of the three categories also account for the bulk of 
all sensory language use in the complete VUAMC: General Sensory 98 %, Sound 60 %, 
and Sight 86%, respectively. It looks as if Sensory vocabulary is not highly varied 
but limited to a small number of frequently used basic terms. It is also striking that 
most of these lexical units are verbs, with nouns coming at some distance. Sensory 
language use apparently favours expression of sense experiences as actions, processes, 
events, or states. A third observation has to do with the differentiation between words 
that are preferably non-metaphorical (e. g., tell, experience, hear, sound, ring, buzz, eye, 
watch), words that are preferably metaphorical (e. g., feel, catch, strike), and words that 
are somewhat balanced between non-metaphorical and metaphorical use (e. g., sense, 
pop, see, look). 

Thus, some Sensory language items typically appear in literal use, as may be illustrated
with reference to tell: 


(1) ... but how can you tell? 

(2) ... and to tell you the naked truth ... 

(3) Tell me what you want. 

(4) ... you cannot tell one from the other ... 

(5) Please, I’ve found something I must tell you. 
(6) Doctor’ll tell us. 


Other Sensory language items typically appear in metaphorical uses, such as catch (only
9 is not metaphorical): 

(7) be up to the US and Canada to decide whether they want to face towards the
Atlantic or Pacific or be caught between two great trading oceans 

(8) he caught the stomach-turning odour of decay 

(9) The people who get caught and imprisoned may not be a representative picture
of all criminals 

(10) Delaney’s stillness caught the attention of the others 

(11) She did and caught her breath 


      General Sensory (n = 513)     Sound (n = 456)            Sight (n = 1,193)
                    Not-M   Met               Not-M   Met                Not-M   Met
 1    tell            200    20     hear         74     6      see         270   159
 2    feel             23   102     sound        51     2      look        226    64
 3    experience       35     1     ring         24     4      eye          63     6
 4    sense            13    23     buzz         22     0      watch        65     3
 5    catch             9    22     strike        5    15      view         18    33
 6    feeling          10     7     pop          11     8      miss         33    11
 7    suffer           12     3     listen       15     0      notice       32     2
 8    distinguish       5     5     noise        13     1      stare        13     3
 9    greet             7     2     silence      12     0      glance       12     1
10    make+out          3     1     meow         11     0      observe       9     4
      Total           317   186                 238    36                  741   286

Table 2: Lexical units and frequencies of top 10 Sensory concepts in non-metaphorical (‘Not-M’) and metaphorical
(‘Met’) use 


And yet other Sensory language items appear to be equally eligible for non-metaphorical 
(12 and 15) and metaphorical (13 and 14) use: 


(12) Because of this he had never seen the Oxford and Cambridge boat race until this
year 

(13) They see themselves not as author and illustrator with separate roles but as a
partnership of book-makers 

(14) so then I’m sure my colleagues will see the point of that 

(15) Otherwise the best place to see working trams has been the tram museum at Crich 


Taken as a whole, all Sensory language seems to be roughly equally useful for the
designation of concrete, genuine Sensory experiences as for more abstract experiences
that are metaphorically expressed by means of Sensory vocabulary. This is not typical
of language use in general, since the average proportion of all metaphorical language is
13.6 %. At the same time, within this group, there is also some division of labour between
non-metaphorical and metaphorical designation: some words seem to specialize
in one direction whereas others prefer another direction, as was illustrated just now.
Worthy of note is the fact that the top 10 for Sound displays only 13.1 % metaphorical 
use; this suggests that the higher figure for metaphorical use for the complete Sound 
concept category is due to the remaining set of lexical units, which are used much less 
frequently than the ones in the top ten. These must be a different type of words, or so it 
seems, since they are used metaphorically more frequently. Further research will have 
to delve into this possible differentiation. 
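
The coverage figures for the top ten (98 %, 60 %, and 86 %) and the 13.1 % metaphorical use within the Sound top ten follow directly from the Total row of Table 2. A minimal sketch of the arithmetic, with `categories` and `top10_profile` as illustrative names:

```python
# Category sizes and top-10 totals (non-metaphorical, metaphorical) from Table 2.
categories = {
    "General Sensory": {"n": 513, "top10": (317, 186)},
    "Sound": {"n": 456, "top10": (238, 36)},
    "Sight": {"n": 1193, "top10": (741, 286)},
}


def top10_profile(n, top10):
    """Share of the category covered by its top 10, and metaphor rate within the top 10."""
    non_met, met = top10
    total = non_met + met
    return {
        "coverage": 100 * total / n,        # % of all category tokens
        "metaphorical": 100 * met / total,  # % metaphorical within the top 10
    }


for name, data in categories.items():
    profile = top10_profile(data["n"], data["top10"])
    print(f"{name}: coverage {profile['coverage']:.0f} %, "
          f"metaphorical {profile['metaphorical']:.1f} %")
```

The gap between the 13.1 % for the Sound top ten and the 23.7 % for the whole Sound category is what points to the less frequent items as the more metaphorical ones.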

We now turn to the other main group of Sensory-Motor concepts, the Motor con- 
cepts. These also comprise three main categories for the purposes of this study: (a) 
Moving, Coming, and Going; (b) Pushing, Putting and Pulling; and (c) Location and Di- 
rection. Their association with metaphorical versus non-metaphorical use is displayed 
in table 3. 

Motor concepts are much more frequent than Sensory concepts, with 24,353 occurrences in
the data, which amounts to some 13 % of the entire VUAMC corpus. There is substantial
variation between the incidence of the three distinct groups of Motor concepts: Location
and Direction concepts comprise 72.9 % of all Motor concepts, while Moving, Coming
and Going account for 17.1 % and Pushing, Putting and Pulling for 10 %. The relation
between these three distinct Motor concept categories and metaphor is significant (χ²(2)
= 51.43, p < 0.001), with Phi and Cramér’s V revealing a small effect size (0.05). The Pushing,
Putting and Pulling category has a greater proportion of metaphorical use (almost one
in two) than the other two categories (just over one in three for Moving, Coming, and
Going, and two in five for Location and Direction), which explains the statistically
significant relation between concept category and metaphor. 

                              Non-metaphor      Metaphor         Total
Moving, Coming, Going         2,553 (61.5)      1,599 (38.5)     4,166 (100.0)
Pushing, Putting, Pulling     1,278 (53.0)      1,135 (47.0)     2,423 (100.0)
Location, Direction           10,737 (60.4)     7,051 (39.6)     17,788 (100.0)
Total                         14,568 (59.8)     9,785 (40.2)     24,353 (100.0)

Table 3: Frequencies (and row percentages) of three types of Motor words, divided by metaphorical and non-
metaphorical use 

As a group, Motor concepts are much more frequently metaphorical than all other 
concepts, given the overall average of 13.6 % of all metaphorical use. The odds of Motor 
concepts being metaphorical in language use are no less than four times higher than the 
odds of all other concepts being metaphorical in usage. The theoretical assumption that 
Motor concepts may play a special role in the grounding of metaphor in usage is hence 
also supported by these corpus-linguistic data. 
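
The odds comparisons for Sensory and Motor concepts can be illustrated with a short calculation. The sketch below assumes that the baseline is the corpus-wide metaphor rate of 13.6 %, which the text does not state explicitly; the variable names are illustrative.

```python
def odds(p):
    """Convert a proportion into odds."""
    return p / (1 - p)


# Assumed baseline: the overall 13.6 % metaphorical use in the VUAMC.
baseline_odds = odds(0.136)

# Odds within each group: metaphorical count divided by non-metaphorical count.
sensory_odds = 649 / 1513    # totals from Table 1
motor_odds = 9785 / 14568    # totals from Table 3

print(f"Sensory vs. baseline: {sensory_odds / baseline_odds:.1f}")
print(f"Motor vs. baseline:   {motor_odds / baseline_odds:.1f}")
```

Under this assumption the ratios come out at roughly 2.7 for Sensory concepts and 4.3 for Motor concepts, in line with "about three times" and "no less than four times".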

Below we will take a closer look at the nature of all three sets of Motor concepts in 
order to elucidate why Location and Direction is so much more frequent than the other 
groups. But a first indication of an answer may be provided by taking a look at the top
10 most frequent Motor concepts in metaphorical and non-metaphorical use (table 4). 

      Moving, Coming, Going        Pushing, Putting, Pulling    Location, Direction
      (n = 4,166)                  (n = 2,423)                  (n = 17,788)
                  Not     Met                  Not     Met                  Not     Met
 1    get         468     243      take         83     222      to         3475    1025
 2    go          551     146      place       180      86      in         1026    1904
 3    come        149     121      put          86     112      for        1417       -
 4    leave        79      47      move         56      29      on          323     780
 5    move         56      29      turn         50      35      there       808      37
 6    turn         50      35      hold         29      43      this         98     703
 7    walk         78       4      bring        32      31      by          716      67
 8    run          20      45      lead         18      45      about        12     394
 9    follow        8      47      pull         40      10      right       266       3
10    return       30      22      set          17      29      where       188      62
      Total      1489     539                  581     642                 8329    4975

Table 4: Lexical units and frequencies of top 10 Motor concepts, divided by non-metaphorical and metaphorical
uses 


The ten most popular Motor words within each category account for the follow- 
ing percentages of all Motor language use in the complete VUAMC: Moving, Coming, 
Going 48.7%, Pushing, Putting, Pulling 50.5 %, and Location and Direction 74.8 %, re- 
spectively. In comparison with Sensory vocabulary, the first two Motor vocabulary 
categories (Moving, Coming, Going, and Pushing, Putting, Pulling) turn out to be much 
more varied, the top ten lexical units accounting for about half of the number of cases 
in the corpus. Location and Direction is more limited to a smaller number of frequently 
used basic terms. 

The latter may clearly be related to the strikingly high numbers of prepositions,
adverbs, and demonstratives emerging in that category, which recur throughout the
data, with a total lack of verbs and nouns. Motor language involving Location and 
Direction most frequently concerns expression of sense relations between entities and 
processes, whereas Motor language involving movement and exerting force is more like 
the Sensory concepts and concerns actions, processes, events and states predominantly 
designated by verbs and their nominal derivations. 

A third observation concerns the distribution of metaphorical use, which differs from that in
Sensory words, where items are preferably non-metaphorical, preferably metaphorical, or mixed. Most 
Motor language is roughly equally useful for the designation of concrete Motor ex- 
periences as for other experiences that are metaphorically derived and expressed by 
means of Motor vocabulary. Note that the lack of metaphorical use of ‘for’ is an artefact 
of the annotation method used in our corpus analysis, where both ‘of’ and ‘for’ were 
taken as too semantically bleached to display reliably recognizable contrasts between 
non-metaphorical and metaphorical uses (Steen et al., 2010a). 

It should be noted that the top 10 for Moving, Coming, Going displays only 26.7 % 
metaphorical use; this suggests that the higher figure for metaphorical use for the com- 
plete Moving, Coming, Going concept category of 38.5 % is due to the remaining set of 
lexical units that are used much less frequently but, apparently, more often metaphor- 
ically. As with the Sound category above, this may be a different type of words meriting 
further exploration. Another interesting observation is the fact that the top 10 Pushing, 
Putting and Pulling words are used more frequently metaphorically than not metaphor- 
ically. This is a unique finding so far and also requires further inspection in the future. 
Both of these findings in this exploratory study suggest important avenues for further 
research. 

There is a substantial difference between the frequencies of Sensory concepts and 
Motor concepts in all of the data, Motor concepts occurring about eleven times as fre- 
quently as Sensory concepts. Is it possible that this is an indication that motion is less 
abstract and even more basic, as it were, than sensory experience, which typically in- 
volves some associated form of cognitive response (cf. Grady, 2005)? We have also seen 
that both Sensory concepts and Motor concepts interact with metaphor in different 
ways than all other concepts: both Sensory and Motor concepts are much more fre- 
quently used metaphorically in language than all other concepts, while Motor concepts 
are even more frequently metaphorical than Sensory concepts. There also appears to be 
a substantial difference between the frequencies of the various subcategories of both the 
Sensory concepts and the Motor concepts, with additionally variable relationships with
metaphorical usage: there is a rank order from General Sensory through Sight to Sound
concepts, which differ significantly from each other in their propensity for metaphorical
use; and there is a three-way distinction between Pushing, Putting and Pulling (highly 
metaphorical) versus the other two Motor categories (less highly metaphorical), one 
of which, however (Location and Direction) is different from the other (Moving, Com- 
ing and Going) on account of its extraordinarily high overall frequency as well as its 
different types of word classes in the top ten. In other words, almost every Sensory- 
Motor category behaves differently than the other ones, suggesting that each type of 
Sensory-Motor concept has properties of its own. 

This warrants taking a closer look at the nature of each subcategory of Sensory- 
Motor concepts in order to try to understand why Motor concepts may be so much 
more frequent than Sensory concepts, why Motor concepts invite metaphorical use 
more often than Sensory concepts, and what may be the causes behind the different fre- 
quencies of each of the subcategories of Sensory-Motor concepts with further variable 
metaphorical use within Sensory concepts and Motor concepts as main groups. Ten- 
tative explanations of these observations will be sought now by examining the nature 
of word classes of the metaphorical and non-metaphorical uses of the various Sensory-Motor
concept categories (section 3.2) and their relation to the four registers of academic
texts, news texts, fiction and conversations (section 3.3). 


3.2 Sensory-Motor Concepts, Metaphor and Word Class 


Can the high metaphorical usage of the Sensory concepts and even more of the Motor 
concepts in comparison with all other concepts be understood with reference to particu- 
lar word classes? Since previous work has shown a relationship between metaphor and 
word class, word class variation between Sensory-Motor concepts and Other concepts 
may also play a role in the variable metaphorical use of the three groups of concepts. 
It is the aim of this section to explore this relationship impressionistically for the most 
obvious understandable patterns. We shall also examine whether these main effects 
of word class on metaphorical usage of Sensory-Motor concepts are compounded by 
further interactions with subcategories of each Sensory-Motor concept or not. If there 
are interactions, the overall picture needs further refinement and a more differentiated
interpretation. I will therefore now check the relation of word class and metaphor to each 
of the three separate subcategories of Motor concepts and of Sensory concepts. 

For this purpose, only the metaphorical uses of the General Sensory concepts, Sound
concepts, and Sight concepts in our data will be related to word class (Adjective, Adverb,
Determiner, Noun, Preposition, Verb, and Remainder). Table 5 displays the findings.
Frequencies and percentages only indicate the proportion of metaphorical use within
a word class for a particular Sensory category, all non-metaphorical uses having been
omitted from the table. 

                              Adj          Adv       Noun          Verb          Remain    Total
General Sensory (n = 514)     0 (0)        -         34 (39.5)     157 (36.9)    -         191 (37.2)
Sound (n = 456)               13 (40.6)    0 (0)     33 (23.4)     62 (26.4)     0 (0)     108 (23.7)
Sight (n = 1,198)             3 (15.8)     0 (0)     84 (28.7)     263 (29.9)    -         350 (29.3)
Total (N = 2,162)             16 (30.8)    0 (0)     151 (29.0)    482 (31.3)    0 (0)     649 (30.0)

Table 5: Frequencies (and percentages) of metaphor related words per word class for three groups of Sensory
concepts 


Systematic statistical analysis by means of a series of comparable chi-square tests 
was not feasible because of the number of cells with zero observations, and collapsing 
categories would have led to complications. But visual inspection confirms that Verbs 
and Nouns account for the bulk of the data (in total 482 Verbs plus 151 Nouns makes
633 out of 649), with Verbs occurring about three times as often as Nouns. In itself this 
is a remarkable proportion, as in general verbs display 18.7 % metaphorical usage, as op- 
posed to nouns 13.3 % (e. g. Herrmann, 2013). Apparently, metaphorical uses of Sensory, 
Sight and Sound words are mostly verbal, followed at great distance by nominal, which 
is completely atypical in comparison with overall tendencies of the relation between 
word class and metaphorical use. 

Variation in metaphorical usage per Sensory category seems to be largely due to 
variation in metaphorical use of the Verb class: General Sensory concepts have the 
highest metaphorical use because Verbs account for 30.5 % of the data (157 out of 514). 
Sight concepts follow suit because metaphorically used Verbs explain 22% of the data 
(263 out of 1,198). And Sound concepts have the lowest proportion of metaphorical 
use because metaphorical Verbs comprise a mere 13.6% of the data (62 out of 456). 
Throughout these patterns, metaphorical nouns consistently account for some 7% of 
the totals and do not affect the overall score for metaphorical use in the three distinct
Sensory categories. 
The distribution of metaphorically used words expressing Sensory concepts hence
mostly depends on the varying popularity of distinct categories of Sensory verbs having
to do with General Sensory experiences, Sound, and Sight. Since Verbs as well as
Nouns generally tend to have a higher metaphorical use than average (Steen et al., 2010a,
b), part of the high metaphorical use of the Sensory concepts is also explained by the
fact that this category is dominated by Verbs and Nouns. However, at the same time,
average metaphorical use of all Verbs and Nouns is substantially lower than 30 %: if this
can be shown to be a significant difference in more encompassing statistical testing,
this would suggest that Sensory Nouns and Verbs are a special category of lexis eliciting
metaphorical use more often than all other Verbs and Nouns. Sensory experience 
expressed in language may then indeed be regarded as a popular basis for metaphorical 
meaning on the basis of its ability to conceptualize the abstract via concrete embodied 
experiences. 
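
The shares of metaphorical Verbs and Nouns cited above for each Sensory category can be reproduced from the counts in Table 5. The sketch below uses the category sizes as printed in that table (514, 456, and 1,198); `table5` and `share` are illustrative names.

```python
# Category size and metaphorical Verb and Noun counts, as printed in Table 5.
table5 = {
    "General Sensory": {"n": 514, "verb": 157, "noun": 34},
    "Sight": {"n": 1198, "verb": 263, "noun": 84},
    "Sound": {"n": 456, "verb": 62, "noun": 33},
}


def share(count, n):
    """Percentage of all category tokens accounted for by `count`."""
    return 100 * count / n


for name, row in table5.items():
    print(f"{name}: metaphorical Verbs {share(row['verb'], row['n']):.1f} %, "
          f"metaphorical Nouns {share(row['noun'], row['n']):.1f} %")
```

The Verb shares fall from about 30.5 % through 22 % to 13.6 %, while the Noun shares stay near 7 % in all three categories, which is why the differences between categories are attributed to Verbs.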

Let us now turn to Motor concepts and relate metaphorical use of (a) Moving, Com- 
ing and Going concepts, (b) Pushing, Putting and Pulling concepts, and (c) Location and 
Direction concepts to word class again (Adjective, Adverb, Determiner, Noun, Preposition,
Verb, and Remainder). Table 6 displays the findings in the same way as table
5: frequencies and percentages only indicate the proportion of metaphorical use within
a word class for a particular Motor category, all non-metaphorical uses having been
omitted from the table. 

                                          Adj           Adv           Det           Noun            Prep            Verb            Remain      Total
Moving, Coming, Going (n = 4,166)         9 (37.5)      0 (0.0)       -             323 (43.3)      -               1267 (37.6)     0 (0.0)     1599 (38.5)
Pushing, Putting, Pulling (n = 2,423)     2 (10.0)      -             -             229 (39.0)      -               904 (50.1)      0 (0.0)     1135 (47.0)
Location, Direction (n = 17,779)          92 (21.9)     682 (33.6)    701 (87.7)    626 (44.3)      4615 (52.1)     280 (51.6)      55 (1.5)    7051 (39.6)
Total (N = 24,368)                        103 (22.2)    682 (33.5)    701 (87.7)    1178 (42.9)     4615 (52.1)     2451 (42.9)     55 (1.5)    9785 (40.2)

Table 6: Frequencies (and column percentages) of metaphor related words per word class for three groups of Motor
concepts 


Table 6 immediately throws into relief the special role of Prepositions for all Sensory-Motor
concept research: they add 4,615 cases to the metaphorical use of Location and Direction
concepts, raising the total for all Motor concepts to the strikingly high figure of 9,785. Since Prepositions 
do not play a role in the other two Motor concept categories, nor in all Sensory con- 
cepts, as we have seen, Location and Direction Prepositions might have to be treated as 
a separate category. They account for almost half of the inordinately high proportion of 
metaphorical use of Motor concepts in comparison with Sensory concepts as well as all 
other concepts. This now appears to be a specific manifestation of the natural connec- 
tion between the concepts of Location and Direction on the one hand and Prepositions 
on the other. It does not appear to be characteristic of the behavior of Sensory-Motor 
concepts in general. 

Statistical analysis was not feasible without raising complications again. Yet vi- 
sual inspection shows that Location and Direction concepts display a different usage 
of Nouns and Verbs than the other two Motion concepts. Where Verbs and Nouns 
account for 99.2% of all Moving, Coming, and Going concepts as well as of all Push- 
ing, Putting and Pulling concepts (which is comparable to what happens in Sensory 
concepts), Verbs and Nouns comprise a meager 12 % in the Location and Direction concepts.
Conversely, Location and Direction is the only Sensory-Motor concept category 
that makes substantial use of Adverbs and Determiners, too—as was already suggested 
by the top ten frequent words in table 4 above. Perhaps it is therefore not just Loca- 
tion and Direction Prepositions, but all Location and Direction lexis which ought to be 
treated as a separate category in the study of Sensory-Motor concepts. 

Focusing on the two remaining categories of Motor concepts, that is, Pushing, Putting 
and Pulling as well as Moving, Coming and Going, these seem to exhibit rather com- 
parable patterns of word class distribution. Both largely involve Verbs and Nouns, with 
Verbs dominating over Nouns in both categories. This is roughly comparable to the 
situation in Sensory concepts. It should not come as a surprise that both Pushing, 
Pulling and Putting concepts as well as Moving, Coming and Going concepts seem to 
be naturally related to the word class of Verbs, and this explains why a good deal of the 
metaphorical usage of these Motor concepts is related to the variable incidence of this 
one word class category. This again accounts for part of the higher metaphorical use 
of Motor concepts, given the generally high metaphorical use of verbs and nouns, but it 
also leaves another portion unexplained which apparently has to do with the specific 
nature of Motor Verbs and Nouns as apt source domains for frequent metaphorization 
of the abstract by the concrete. 

Location and Direction concepts display behavior which is not shared by the other
two Sensory-Motor categories examined in these data. Whereas initially it seemed natural
to include Location and Direction under Motion and Motor concepts, this may now
require further theoretical reflection. Moving, Coming and Going concepts resemble
Pushing, Putting and Pulling concepts when it comes to their lexical expression in usage,
Verbs and at some distance Nouns dominating the scene. Location and Direction
display a completely different profile and are the only category that is heavily dependent
on other word classes than Verbs and Nouns, with Prepositions, Adverbs and
Determiners instead being most prevalent. 

In sum, the relation between metaphor and Sensory-Motor concepts may be partly 
explained with reference to their interaction with word class. For the 2,168 Sensory 
concepts in the corpus, there are basically just two word classes involved, Verbs clearly 
dominating the picture, accounting for almost three quarters of all Sensory concepts. 
What is more, one third of these Sensory Verbs are used metaphorically, which is an 
inordinately high percentage: Sensory Verbs apparently lend themselves to metaphor- 
ical usage very easily. Likewise, Sensory Nouns account for the remaining quarter of all 
Sensory concepts, with a proportion of over 40 % being used metaphorically, which is 
also strikingly high. 

For the 24,566 Motor concepts, we have a situation that is comparable to the Sensory 
category for two of the three Motor categories: Moving, Coming, and Going, and Push- 
ing, Putting and Pulling. There is one category that is starkly different, Location and 
Direction: there Prepositions play a deviant and prominent role, accounting for more 
than one third of all Motor concepts in the complete corpus. Moreover, metaphorical 
use of Motor Prepositions is extraordinarily high, comprising over 50% of all Motor 
Prepositions. Prepositions hence account for 4,615 cases out of all 9,785 Motor concepts 
that are metaphorical. With the additionally different behavior of Adverbs and Deter- 
miners as well as Verbs and Nouns in the Location and Direction category, a case can be 
made for separating this category from the other two Motor concepts. 

We already saw that Sensory concepts appear to be rather different than Motor con- 
cepts, but we may now add that perhaps all Sensory-Motor concepts ought to be seen 
as comprising not two but three rather distinct groups of concepts: Sensory Concepts, 
Motor concepts (including Moving, Coming, Going, and Pushing, Putting, Pulling), and 
Location and Direction concepts. This is based on the radically different relation be- 
tween the various categories and word classes. Partly as a result of this, their overall 
frequency in language use varies considerably too: 1.16 % for Sensory concepts, 3.55 % 
for Motor concepts, versus 9.55 % for Location and Direction, respectively. The interaction
between Sensory-Motor concepts and metaphor is clearly affected by the interaction
between metaphor and word class. Apart from this, in the other two Motor concept
categories, Verbs and Nouns are more frequently metaphorical in comparison with
Sensory Verbs and Nouns (roughly over 40 % in Motor concepts versus about 30 % in
Sensory concepts). Why Motor concepts would elicit more metaphorical use than Sensory
concepts is an intriguing question. With reference to Grady (2005), I have raised
the question whether they might be less abstract and involve less mental response. 


3.3 Sensory-Motor Concepts, Metaphor and Register 


Can the relatively high metaphorical usage of the Motor concepts and the Sensory 
concepts be related to the increased use of Sensory-Motor concepts in specific registers, 
in comparison with other concepts? Since previous work has shown a relationship 
between metaphor and register, register variation in Sensory-Motor concepts may also 
interact with the metaphorical use of various groups of concepts. We shall now see 
whether these main effects of register on metaphorical usage of Sensory-Motor concepts 
can be refined by checking each of the separate subcategories of Motor concepts and 
Sensory concepts. We shall begin with the Sensory concepts again. 

The overall distribution of the Sensory concept lexis across the four registers turns 
out to be very uneven. In the complete VUAMC corpus, the four registers are about 
equally large, averaging about 47,000 words each, which would predict a 25% division 
of the Sensory concepts across the registers by chance. This is not the case: Fiction has 
a high 40 % of all Sensory concepts, followed by Conversation, which is close to average 
with 28.1 %, while News (16 %) and Academic texts (15.9 %) are low. One interpretation 
of this finding is that Fiction has an emphasis on Sensory experience that is there for 
artistic reasons, making experience more palpable, as opposed to the more abstract 
concerns of News and Academic texts. 

Table 7 displays the frequencies and percentages of only the metaphorical words 
per register for each of the three Sensory concept categories. The overall pattern of 
metaphorical usage in the complete VUAMC corpus manifested the following percent- 
ages for all lexis, Sensory-Motor and otherwise: Academic 18.5%, News 16.4 %, Fiction 
11.7%, and Conversation 7.7 % (Steen et al., 2010a, b). From the previous sections we 
already know that there is a higher percentage of metaphorical use for Sensory con- 
cepts than average, but now we can observe two further conspicuous differences when 
we turn to the relation between metaphor and register for Sensory concepts. First of
all, there seems to be a split between Academic and News texts on the one hand and
Fiction and Conversation on the other, with Academic and News texts having double
or more than double the proportion of metaphorical uses found in Fiction and Conversation.
And secondly, where Sensory concepts in Academic and News texts are in the same 
ordering from more to less metaphorical as may be observed for all concepts, Fiction 
and Conversation are in roughly the same position, Fiction having less metaphor and 
Conversation having more metaphor than expected when compared with the general 
pattern in the complete corpus. Upon close inspection this is solely due to what hap- 
pens in the Sight category, which exerts a relatively great effect on the overall patterns 
because it accounts for half of all Sensory concept cases: in the General Sensory and 
Sound categories, the rank order between the registers regarding metaphorical use is 
in accordance with the overall pattern in the complete corpus. What we are dealing 
with, therefore, is a three-way interaction between metaphor, concept category and 
register, which moreover has to be seen against the background that Sensory concepts 
are proportionately much less frequent in Academic and News texts as opposed to Fic- 
tion where they are much more frequent. The relation between Sensory concepts and 
metaphor in usage is thus rather complicated when we examine it from the perspective
of genre, which clearly affects their interaction.


                   Academic   News     Fiction   Conversation   Total
General Sensory    44         46       66        35             191
(n = 514)          (46.8)     (52.3)   (33.5)    (26.1)         (37.2)
Sound              19         35       40        14             108
(n = 456)          (59.4)     (36.1)   (19.3)    (11.7)         (23.7)
Sight              131        68       70        81             350
(n = 1,198)        (60.1)     (42.0)   (15.2)    (22.9)         (29.3)
Total              194        149      176       130            649
(N = 2,162)        (56.4)     (42.9)   (20.4)    (21.4)         (30.0)


Table 7: Frequencies (and percentages) of metaphor related words per register for three groups of Sensory concepts 
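
As a rough cross-check, the register totals behind Table 7 can be back-calculated from the metaphorical counts and percentages in its Total row, which in turn gives the share of Sensory concept lexis per register. The following Python sketch does this; the rounding to whole words and all variable names are assumptions made for illustration and are not part of the original analysis.

```python
# Total row of Table 7: metaphor-related Sensory words and their percentage per register.
metaphor_counts = {"Academic": 194, "News": 149, "Fiction": 176, "Conversation": 130}
metaphor_pct = {"Academic": 56.4, "News": 42.9, "Fiction": 20.4, "Conversation": 21.4}

# Assumption: each percentage is count / register total, so total ~ count / (pct / 100).
register_totals = {
    reg: round(metaphor_counts[reg] / metaphor_pct[reg] * 100)
    for reg in metaphor_counts
}
grand_total = sum(register_totals.values())

# Share of all Sensory concept words found in each register (chance level: 25 %).
register_share = {
    reg: round(100 * total / grand_total, 1)
    for reg, total in register_totals.items()
}

for reg in register_totals:
    print(f"{reg:<13}{register_totals[reg]:>6}{register_share[reg]:>7.1f} %")
```

Under these assumptions the recovered shares come out close to the distribution described in the text (roughly 16 % each for Academic and News texts, about 40 % for Fiction and about 28 % for Conversation). Small deviations are to be expected because the published percentages are rounded.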


For each of the three Sensory concepts, the relation between metaphor and genre
was tested by means of a two-way chi-square test of significance. All tests returned
significant results: for General Sensory concepts, χ²(3) = 20.46, p < 0.001, Phi and
Cramer’s V = 0.20; for Sound concepts, χ²(3) = 42.57, p < 0.001, Phi and Cramer’s V
= 0.31; and for Sight concepts, χ²(3) = 163.14, p < 0.001, Phi and Cramer’s V = 0.37.
Standardized residuals revealed significant effects of the categories furthest removed
from the expected frequencies, such as high metaphoricity in News for General Sensory
concepts, high metaphoricity in Academic texts and News texts for Sound, and high
metaphoricity for Academic texts but low metaphoricity for Fiction in Sight. The most
prominent differences between registers manifested for metaphor in each of the three
Sensory concept categories are statistically reliable.
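
To make the procedure concrete, the test for General Sensory concepts can be sketched in a few lines of Python. The metaphorical counts are those of Table 7; the register totals (94, 88, 197, 134) are not given in the table but are back-calculated from the rounded percentages and are therefore an assumption, as are the function and variable names. The sketch computes the Pearson chi-square statistic, Cramér's V and the standardized (Pearson) residuals.

```python
from math import sqrt


def chi_square_test(table):
    """Pearson chi-square test of independence for a table given as a list of rows.

    Returns the statistic, the degrees of freedom, Cramer's V and the
    standardized (Pearson) residual of every cell.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    residuals = []
    for i, row in enumerate(table):
        residual_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
            residual_row.append((observed - expected) / sqrt(expected))
        residuals.append(residual_row)
    n_rows, n_cols = len(table), len(table[0])
    df = (n_rows - 1) * (n_cols - 1)
    cramers_v = sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))
    return chi2, df, cramers_v, residuals


registers = ["Academic", "News", "Fiction", "Conversation"]
metaphorical = [44, 46, 66, 35]        # General Sensory row of Table 7
register_totals = [94, 88, 197, 134]   # assumption: back-calculated from the percentages
non_metaphorical = [t - m for t, m in zip(register_totals, metaphorical)]

chi2, df, v, residuals = chi_square_test([metaphorical, non_metaphorical])
print(f"chi2({df}) = {chi2:.2f}, Cramer's V = {v:.2f}")
for reg, res in zip(registers, residuals[0]):
    print(f"{reg:<13} residual (metaphorical) = {res:+.2f}")
```

With these reconstructed counts the sketch lands close to the reported values for General Sensory concepts. Cells whose residual lies beyond roughly ±1.96 are the ones that deviate reliably from the expected frequency, which is how the high metaphoricity of General Sensory concepts in News shows up.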

For each of the four genres, the relation between metaphor and Sensory concept 
category was also tested by means of a two-way chi-square test of significance. Two 
tests returned significant results: for Fiction, χ²(2) = 28.61, p < 0.001, Phi and Cramer’s V
= 0.18; and for Conversation, χ²(2) = 9.03, p = 0.01, Phi and Cramer’s V = 0.12. Standardized
residuals revealed significant effects of the metaphorical use of Sound categories
in Conversation, which is extremely low compared with the other two concept types 
in Conversation; of metaphorically used General Sensory concepts in Fiction, which is 
very high within Fiction, as well as of metaphorically used Sight concepts in Fiction, 
which is low within Fiction. For Academic texts and News texts, chi square was not 
significant, although revealing a tendency towards significance (p < 0.1): all Sensory 
concept categories are used in roughly comparable measure in both of these registers. 
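
The p-values that go with such statistics can be checked by hand, because the chi-square distribution has simple closed-form tail probabilities for 2 and 3 degrees of freedom. The Python sketch below uses only the standard library; the function name is hypothetical and the restriction to these two cases is a simplification made for illustration.

```python
from math import erfc, exp, pi, sqrt


def chi2_p_value(statistic, df):
    """Upper-tail probability of the chi-square distribution for df = 2 or df = 3."""
    if df == 2:
        # For two degrees of freedom the survival function is exp(-x / 2).
        return exp(-statistic / 2)
    if df == 3:
        # For three degrees of freedom: erfc(sqrt(x / 2)) + sqrt(2x / pi) * exp(-x / 2).
        return erfc(sqrt(statistic / 2)) + sqrt(2 * statistic / pi) * exp(-statistic / 2)
    raise ValueError("only df = 2 and df = 3 are covered in this sketch")


# Values reported in the text for the tests within Fiction and Conversation (df = 2).
for label, statistic in [("Fiction", 28.61), ("Conversation", 9.03)]:
    print(f"{label:<13} chi2(2) = {statistic:>6.2f}  p = {chi2_p_value(statistic, 2):.4g}")
```

For Conversation this gives a p-value of about 0.011, in line with the reported p = 0.01, while the Fiction statistic falls well below the 0.001 threshold.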

For Sensory concepts, we see a clear split between registers. The abstract registers 
of Academic and News texts have a comparatively low percentage of Sensory concepts 
that are at the same time used metaphorically very often. In Academic texts,
Sensory concepts are even used metaphorically more than half of the time, which is
a unique finding. The more concrete registers of Conversation and Fiction have an 
understandably high proportion of Sensory concepts that at the same time are used 
metaphorically much less frequently than in Academic and News Texts, making Con- 
versation and Fiction even more concrete. For instance, in our data the verb to feel is 
used non-metaphorically only in Conversation and Fiction (feel the cold, feel warm), not 
in Academic and News, where it is always used metaphorically. It is also true, however, 
that Sensory concepts in Fiction and Conversation are still used metaphorically twice
as often as all metaphorical lexis taken together in the entire VUAMC corpus: in the 
overall corpus, Conversation has 7.7 % metaphor, and Fiction 11.7 % metaphor, whereas 
for Sensory language use, these percentages climb to over 20 %. This may also be due to 
the relative frequency of such constructions as feel anxious, guilty, uneasy, and so on, 
which feature quite large in Conversation and Fiction. All of this is still a powerful indi- 
cation that Sensory concepts do play a special role in affording metaphorical language 
and perhaps conceptualization. 

We will now do the same analysis for Motor concepts. We will relate metaphorical
use of (a) Moving, Coming and Going concepts, (b) Pushing, Putting and Pulling
concepts, and (c) Location and Direction concepts to the four registers. Table 8 displays
only the metaphorical frequencies and percentages of the Motor concepts (with
metaphorical and non-metaphorical totals listed under n in the first column).


                            Academic   News     Fiction   Conversation   Total
Moving, Coming, Going       410        460      343       386            1599
(n = 4,166)                 (70.3)     (55.1)   (30.4)    (24.1)         (38.5)
Pushing, Putting, Pulling   347        384      265       139            1135
(n = 2,423)                 (65.7)     (61.6)   (34.6)    (28.1)         (47.0)
Location, Direction         2786       1933     1311      1021           7051
(n = 17,779)                (54.0)     (41.9)   (31.4)    (26.6)         (39.6)
Total                       3543       2777     1919      1546           9785
(n = 160,167)               (56.5)     (45.8)   (31.6)    (26.1)         (40.2)


Table 8: Frequencies (and percentages) of metaphor related words per register for three groups of Motor concepts 


In contrast with the Sensory concepts, the overall distribution of Motor concept lexis 
across the four registers is even. The percentages of Motor concepts across the four 
registers of Academic texts, News texts, Fiction, and Conversation are 25.8, 24.9, 25.0, 
and 24.4, respectively. This is in accordance with the size of the four subcorpora, and
with what might be expected by chance. It throws into relief the
special value of the previous finding of the uneven distribution of Sensory concepts 
and suggests that there may be a difference between the roles of Sensory and Motor 
concepts that needs to be examined more closely. 
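
The contrast between the uneven Sensory distribution and the even Motor distribution can be illustrated by comparing both with a uniform 25 % split. The Python sketch below back-calculates register totals from the Total rows of Tables 7 and 8 (an assumption, since the tables list only metaphorical counts and rounded percentages) and expresses the departure from uniformity as Cohen's w, an effect-size measure that is not used in the original study and is introduced here purely for illustration.

```python
from math import sqrt


def back_calculated_totals(metaphor_counts, metaphor_pcts):
    """Estimate register totals from metaphorical counts and rounded percentages."""
    return [round(c / p * 100) for c, p in zip(metaphor_counts, metaphor_pcts)]


def uniform_goodness_of_fit(totals):
    """Chi-square against equal shares per register, plus Cohen's w = sqrt(chi2 / n)."""
    n = sum(totals)
    expected = n / len(totals)
    chi2 = sum((t - expected) ** 2 / expected for t in totals)
    return chi2, sqrt(chi2 / n)


# Order of registers: Academic, News, Fiction, Conversation.
sensory_totals = back_calculated_totals([194, 149, 176, 130], [56.4, 42.9, 20.4, 21.4])
motor_totals = back_calculated_totals([3543, 2777, 1919, 1546], [56.5, 45.8, 31.6, 26.1])

sensory_chi2, sensory_w = uniform_goodness_of_fit(sensory_totals)
motor_chi2, motor_w = uniform_goodness_of_fit(motor_totals)

print(f"Sensory: totals = {sensory_totals}, w = {sensory_w:.2f}")
print(f"Motor:   totals = {motor_totals}, w = {motor_w:.2f}")
```

On these reconstructed totals the departure from an even split is many times larger for Sensory concepts than for Motor concepts, which is the pattern described above. The figures depend on the back-calculation and should be read as indicative only.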

The overall rank order of metaphorical usage across genres in the complete corpus 
is also reflected in the distribution of the Motor concepts across the four genres: Aca- 
demic has the highest percentage (56.5), followed by News (45.8) and Fiction (31.6), with 
Conversation at the low end of the scale (26.1). We already knew that there is a higher 
percentage of metaphorical use for Motor concepts than average, but we can now see 
that this holds for all registers, and that the mutual difference in metaphorical usage 
between the four genres may be somewhat greater than for all metaphor use. This will 
have to be examined in future research with more encompassing statistical tests. 

Next, when we examine the difference between Location/Direction concepts and the
other two sets of Motor concepts, it looks as if there is an interaction between concept
type and register: both Academic texts and News texts display a rather high frequency
of metaphorically used Moving, Coming and Going concepts as well as Pushing,
Putting, and Pulling concepts, while all other concepts seem to be distributed across the
four registers according to chance. Two series of two-way statistical tests by means of
chi-square showed whether these first impressions were reliable.

For each of the three Motor concepts, the relation between metaphor and genre
was tested by means of a two-way chi-square test of significance. All tests returned
significant results: for Moving, Coming, and Going, χ²(3) = 519.23, p < 0.001, Phi and
Cramer’s V = 0.35; for Pushing, Putting and Pulling, χ²(3) = 246.69, p < 0.001, Phi and
Cramer’s V = 0.32; for Location and Direction, χ²(3) = 843.75, p < 0.001, Phi and Cramer’s
V = 0.22. Standardized residuals revealed significant effects of all categories in each of
the two-way interactions, suggesting that no single category crossing two variables
behaved according to expectation by chance.
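
The reported effect sizes follow directly from the chi-square statistics and the category sizes listed under n in Table 8. A minimal Python sketch of that relation is given below; it assumes a 2 × 4 table (metaphorical versus non-metaphorical by four registers) for each test, and the function name is chosen for illustration only.

```python
from math import sqrt


def cramers_v(chi2, n, n_rows=2, n_cols=4):
    """Cramer's V for an n_rows x n_cols table: sqrt(chi2 / (n * (min dimension - 1)))."""
    return sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))


# (chi-square statistic from the text, n from Table 8)
reported = {
    "Moving, Coming, Going": (519.23, 4166),
    "Pushing, Putting, Pulling": (246.69, 2423),
    "Location, Direction": (843.75, 17779),
}

effect_sizes = {name: round(cramers_v(chi2, n), 2) for name, (chi2, n) in reported.items()}
for name, v in effect_sizes.items():
    print(f"{name:<27} V = {v:.2f}")
```

For a table with only two rows, Phi and Cramér's V coincide, which is why the text reports a single value for both.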

For each of the four genres, the relation between metaphor and Motor concept category
was also tested by means of a two-way chi-square test of significance. Two tests
returned significant results: for Academic, χ²(2) = 77.24, p < 0.001, Phi and Cramer’s V =
0.11; and for News, χ²(2) = 119.52, p < 0.001, Phi and Cramer’s V = 0.14. Standardized
residuals revealed significant effects of all categories in each of these two two-way interactions.
For Fiction and Conversation, chi-square was not significant, although for
Conversation a tendency towards significance was revealed (p < 0.1).

In sum, each of the registers differs from the others when it comes to their use of each 
of the distinct Motor concepts. Moreover, Academic and News texts display different 
usages of each of the three Motor concepts within their own register. In Academic texts, 
there is a stunning 70% of metaphorical usage of Moving, Coming, and Going lexis, 
followed by 65.7% of metaphorical usage for Pushing, Putting, and Pulling. In News 
texts, Pushing, Putting and Pulling leads the way, with 61.6%, followed by Moving,
Coming and Going, with 55.1%. Examples would include metaphorical uses of take in 
academic writing such as take issue with, take an example, take a more mature attitude, 
take note of, take the view, and so on. This is to be contrasted with metaphorical usage 
of both concept categories in both Fiction and Conversation, where percentages range 
between 24.1 % and 34.6 %. The verb take is used in those registers relatively more often 
as a verb that involves the taking of a concrete object. Location and Direction have 
a much lower metaphorical percentage in Academic and News texts, while they are 
relatively comparable to the other concept categories in Fiction and Conversation. 

These are clear quantitative indications that the metaphorical use of Motor concepts 
in language cannot be treated as one uniform phenomenon, but that more work needs 
to be done on the relation between Motor concepts, metaphor, and register. A close
examination of the cases involved is the next step that needs to be taken.

The relation between Sensory-Motor concepts and metaphor in language is clearly
affected by register. Sensory concepts have an uneven distribution across registers, with 
Fiction clearly favoring Sensory concepts (in order to create a fictional world) while 
Academic and News texts do not; Motor concepts, by contrast, are evenly distributed. 
The language of fiction therefore has a higher Sensory-Motor quality than other
registers, while the language of Academic and News texts is less ‘Sensory-Motor’. At
the same time, Academic and News texts throughout favor metaphorical use of both 
Sensory and Motor concepts, even in absolute terms. This accords with their abstract 
nature and contrasts with the predominance of non-metaphorical use of Sensory-Motor 
terms in Fiction and Conversation. In addition, since Academic and News texts tend 
to be more metaphorical than Fiction and Conversation overall, it can now be seen 
that Sensory-Motor terms make a substantial contribution to this two-way distinction
between the four registers.


4 Discussion 


The relation between Sensory-Motor concepts and metaphor in usage has been on the 
agenda of cognitive linguists, psychologists, and scientists in general for some time. 
Theoretical motivation for this interest is amply available, but the present study is the 
first corpus-linguistic exploration of this relationship. Even though the study is partial 
and tentative, it has revealed some new tendencies which require further scrutiny on
the basis of more encompassing research, which is currently undertaken in our lab. 
The most important observation is that Sensory-Motor concepts on the one hand do 
display a higher degree of metaphorical use than all other concepts, but that on the other 
hand this relationship is not uniform but variable across all categories as well as groups 
of categories that can be distinguished between the Sensory-Motor concepts included in 
this study. Thus, Motor concepts are eleven times more frequent than Sensory concepts; 
Sight concepts are twice as frequent as Sound concepts and general Sensory concepts; 
and Location and Direction concepts are an entirely different group of Sensory-Motor 
concepts than all others, comprising three quarters of all Motor concepts and having 
a radically different word class profile than the other five concept categories. In particular,
all other Sensory-Motor concepts are dominated by verbal and at some distance
nominal expression, while Location and Direction are based on prepositions, adverbs
and demonstratives. Further research including other Sensory-Motor concepts clearly
needs to throw more light on the diversity of this group of concepts in order to establish
its internal coherence.

The second most important observation is that despite this internal variation, all 
Sensory-Motor concepts are much more often metaphorical than all other concepts. 
This is consistent with the idea that Sensory-Motor knowledge has a special role to play 
in the metaphorical conceptualization of our experience. The ground of this idea is the 
assumption that Sensory-Motor knowledge is the most specific and best-differentiated
concrete knowledge we have which can then be used as a model for less specific, less dif- 
ferentiated more abstract knowledge, for instance about social relations and processes 
(Sight for Understanding) or temporal and abstract processes (Motion for Change). The 
details of these varying relationships can now be studied in context with reference to 
a substantial set of natural language materials. 

A third point emerging from this study is the role of register. Sensory-Motor con- 
cepts are not just more frequently related to metaphor in usage, perhaps mediated via 
obvious distinctions between word classes; these relations are also exploited to a greater 
or lesser extent in distinct situations of language use. We saw a clear distinction be- 
tween, on the one hand, the more abstract registers of Academic and News texts, and, 
on the other hand, the more concrete registers of Fiction and Conversation. Sensory 
concepts were dispreferred in the former two, but those Sensory concepts that were 
used there were massively metaphorical. Sensory concepts were preferred in Fiction 
and Conversation, but their use was much less often metaphorical than in Academic 
and News, even if it was still more metaphorical than the average metaphorical use of 
all other concepts in Fiction and Conversation. 

Motor concepts displayed a different relationship with register. They were dis- 
tributed evenly across all registers but their metaphorical use went down from Aca- 
demic through News and Fiction to Conversation. Metaphorical use of Sensory-Motor 
concepts is clearly promoted in Academic and News texts and less so in Fiction and 
Conversation. 

The relation between Sensory-Motor concepts and metaphor in usage is therefore 
no simple one. It involves a four-way interaction between Sensory-Motor concepts, 
metaphor, word class, and register. This paper has only begun to sketch the possible 
outlines of this complex picture. I hope that it will provide a useful inspiration for more
encompassing as well as thorough and detailed work in the future.

Abstract

In this article we outline a theory of action verbs that combines a modality-independent 
(or abstract) conceptual component with a modality-specific one. Verbs as concepts 
are interpreted as ranked sets of nuclei structures in the sense of Moens and Steedman 
(1988). This information is stored in the middle temporal gyrus (Bedny and Caramazza 
2011). Besides being amodal, this information is underspecified w.r.t. a particular way in 
which the action is executed (grasp a needle vs. grasp a barbell), i.e. it is not grounded in 
a particular situation. This underspecification can in general only be resolved if the type 
of object undergoing the change (needle vs. barbell) is known. Following Willems et al. 
(2009), this grounding is explained as an implicit simulation in premotor cortex, that is 
a preenactment of the action which makes it possible to predict the way in which the 
action evolves and which is distinct from explicit (motor) imagery. 


1 Theories of grounded cognition: evidence and problems¹


According to Zwaan and Kaschak (2008: 368), ‘language is a sequence of stimuli that 
orchestrate the retrieval of experiential traces of people, places, objects, events, and 
actions’. They illustrate this view of language with an example taken from Barsalou
(1999). When reading the sentence John removed an apple pie from the oven, a compre- 
hender understands this sentence by retrieving past experiences involving persistent 
objects like apple pies and ovens as well as events of removing something, for instance, 
an apple pie from an oven. These traces usually include both motor experiences such as 
lifting the pie and feeling its weight and perceptual experience like seeing and smelling 
the pie and feeling the heat coming out of the oven. Similarly, when processing the 
verb throw or the sentence Bill throws the ball, a speaker mentally simulates an action of 
throwing (Pulvermüller 2005). On this view, ‘the understanding of action-related sentences
implies an internal simulation of the action expressed in the sentences, mediated
by the activation of the same motor representations that are involved in their execution’
(Buccino et al. 2005: 361). Accordingly, understanding words and other linguistic items
is based on the same neural substrate as imagining the actions and objects described by
those linguistic expressions (Gallese and Lakoff 2005: 456). For example, Gallese and
Lakoff argue that one can understand the sentence Harry picked up the glass only if one
can imagine picking up a glass or seeing someone picking up a glass. This view is in
line with the idea of Hebbian learning: neuronal correlation is mapped onto connection
strength. As formulated by Hauk et al. (2004: 301): ‘If word forms frequently co-occur
with visual perceptions (object words), their meaning-related activity may be found in
temporal visual areas, whereas action words frequently encountered in the context of
body movements may produce meaning-related activation in the frontocentral motor
areas’. If a verb refers to actions and events that are typically performed with the face,
arm or leg, neurons processing the word and those processing the action described by
that word frequently fire together and thus become more strongly linked. As a result,
word-related networks overlap with motor and premotor cortex in a somatotopic fashion
(Pulvermüller 1999). On this semantic somatotopy view of meaning, being able to
simulate executing an action of the type denoted by the verb is constitutive of the verb’s
meaning.

1 The research was supported by the German Science Foundation (DFG) funding the Collaborative Research
Center 991.

Empirical evidence for theories of grounded (or embodied) cognition comes from 
neuroimaging studies using fMRI or ERP. When action words are processed, there is
effector-specific activation of motor areas that is somatotopically organized. For exam- 
ple, a leg-related word like kick activates dorsal areas, where leg actions are represented 
and processed, whereas arm-related words such as pick or face-related words such as 
lick activate lateral or inferior frontal motor areas, respectively. Similarly, when read- 
ing or viewing the noun hammer, the hand and not the foot area of the motor system 
is activated. 

Such theories of embodied cognition make a number of empirically testable predic- 
tions: (i) understanding an action verb and imagining performing that same action rely 
on the same neural tissue, in particular premotor cortex (Willems et al. 2009: 2388), 
(ii) understanding action verbs is primarily based on early, modality-specific, sensory- 
motor brain regions (Bedny and Caramazza 2011: 82) and (iii) these sensory-motor brain 
regions are automatically engaged during word comprehension (Bedny and Caramazza 
2011: 82). 

The first problem for theories of grounded cognition is that many neuroimaging
studies failed to observe any increased activity for action-verbs anywhere in the motor
system (Bedny and Caramazza 2011: 87). A notable exception is the study by Willems et
al. (2009). In an fMRI study they examined whether implicit simulations of actions during
language understanding involve the same cortical motor regions as explicit motor
imagery. The participants were presented with verbs that are either related to actions 
that are usually executed with the hand, like throw, or with verbs that are not related 
to this body part, like kneel. In order to control for spurious activation due to explicit 
imagery, there were two different tasks: participants either read the verbs (lexical de- 
cision task LD) or they actively imagined performing the actions denoted by these verbs 
(imagery task IM). Contrary to earlier results, they found a double dissociation. Primary 
motor cortex showed effector-specific activation during imagery, but not during the lex- 
ical decision task. For the premotor area they found out that there was effector-specific 
activation that distinguished between manual and non-manual verbs, both in LD and 
in IM. But importantly, there was no overlap or correlation between regions activated 
during the two tasks. More precisely, portions of BA6 and BA4 that were defined on 
the basis of effector-specific activity during the IM task showed no such activity during 
LD. Similarly, regions in BA4 and BA6 that showed effector-specific activity during LD 
showed no such activity during IM. The authors conclude: “These double dissociations 
show that implicit motor simulation and explicit motor imagery do not necessarily en- 
gage the same neural tissues in premotor and primary motor cortices and by inference 
may not include the same cognitive processes” (Willems et al. 2009: 2396). 

Similar to the Willems et al. study, Postle et al. (2008) found effector-specific ac- 
tivity in premotor cortex only when participants viewed actions performed with hand, 
arm or foot. By contrast, when they silently read the corresponding verbs, there was 
only activation in premotor cortices. Importantly, premotor leg, arm and hand areas 
responded to all action-verbs in the same way, i.e. there was no somatotopical reac- 
tion. In addition, several of these premotor areas also responded to nouns and even 
non-words. These results constitute strong evidence against prediction (i), i.e. that understanding
action verbs and imagining performing those actions rely on the same, or
at least overlapping, neural tissues. Summarizing, one gets the following correlations:

• Primary motor cortex is active during motor imagery; during processing of action
  verbs this cortex is not active, provided no corresponding instructions are given.

• Premotor cortex areas are active during comprehension of action verbs; however,
  there is no overlap with areas in this cortex that are active during explicit imagery.
  In addition, there need be no effector-specific activity.

According to Bedny and Caramazza (2011: 87), results like the above raise the important
question of whether such activity in left premotor areas is specific to action
verb comprehension or whether this activity rather reflects a more general contribu- 
tion of premotor cortex to language. Evidence for such a more general contribution 
comes from several studies. Graziano (2006) showed that activity in premotor areas is 
more sensitive to the behavioral context and possible goals and results brought about 
by an action.² Schluter et al. (1998) found that premotor cortex is involved in higher-order
aspects of movement like sequencing and movement selection. Similarly, this
cortex is involved in planning and predicting actions and sequentially structured events
(Schubotz and von Cramon 2004). Taken together, these findings suggest that the premotor
cortex shares features with adjacent prefrontal cortex (Miller and Cohen 2001).

Evidence against prediction (ii) comes from studies involving the middle temporal 
gyrus (MTG). There is more activity in MTG when participants generate action verbs 
than when they generate color names for visually presented nouns. MTG is more active 
when action verbs are processed compared to the processing of nouns for concrete 
objects and color adjectives. Furthermore, MTG response is equally high with action 
verbs like run and mental state verbs like think and it is equally low for nouns denoting 
animals like tigers which are rich in motion features and nouns like rock which are low 
in motion features. In addition, MTG responds more to verbs like give compared to 
verbs like run. This area responds to action verbs in the absence of a sentence context. 
Representations are neither visual nor motion related and regions in MTG that are 
activated during processing of action verbs do not overlap with visual-motion regions. 
Bedny and Caramazza (2011: 91) conclude that “these results argue that the MTG stores 
modality-independent representations that encode conceptual rather than perceptual 
properties. ... Together, these results suggest that the MTG represents conceptual 
information about events or meaning-relevant grammatical information about verbs.” 

A key question with respect to prediction (iii) is: Do effector-specific activations 
show that they are used by speakers to semantically analyze the word or the words 
in a sentence? As first noted in Postle et al. (2008), this need not be the case. The 
motor activation can be an epiphenomenon of processing the word or the constituents 
in the sentence. The speaker semantically analyzes the expressions and simultaneously
or subsequently (s)he mentally imagines executing a corresponding action or event. As
noted by Bedny and Caramazza (2011: 83), language-perception interactions need not
result because action-verb meanings are represented but rather because verb meaning
representations prime visual motion representations during contemporaneous linguistic
and perceptual tasks.⁴

2 This example as well as the following ones are taken from Bedny and Caramazza (2011).

3 For details on the following, see the discussion in Bedny and Caramazza (2011) as well as the references
cited therein.


2 Action, events and the dynamic structure of action verbs 


When viewed from a linguistic, in particular semantic, viewpoint, a general weakness
of most studies involving action verbs lies in the restriction to testing isolated verb
forms, in general infinitive forms like kick or throw.⁵ However, what type of action or
event is denoted by an expression, say a sentence, in which an action verb occurs, not 
only depends on the verb but also on its arguments and their semantic (or referential) 
properties. Consider, for instance, the German examples in (1). 
(1) a. Hans lief (stundenlang im Park herum).
    b. Hans lief zum Bahnhof.
    c. Hans lief durch den Park.
    d. Hans lief zu Hochform auf.


Example (1a) is an activity expression admitting of modification with a for- but not 
with an in-adverbial. It describes an action as unbounded in the sense that no particular 
goal (say a destination to be reached) is specified.⁶ By contrast, example (1b) describes a
running that has an explicit goal: the station. The action is therefore bounded by this 
destination. Linguistically, this is reflected by the admissibility of modification with 
in-adverbials but not of that with for-adverbials. Example (1c) can be taken to either 
describe an unbounded or a bounded event. In the first case it corresponds to (1a) (Hans 
ran across the park), whereas in the second case it corresponds to the English translation
Hans crossed the park. The last example differs from the preceding ones. Here laufen
is used in an idiomatic and not in its literal sense. (1d) does not necessarily describe
an event which involves a particular motor program involving the legs. For example, it
can be used in a situation where Hans did a great job in convincing the audience during
a talk he gave at the university.⁷

4 As noted by Willems et al. (2009: 2398), another reason why there is effector-specific activity in motor
areas may be that participants in those studies were not prevented from forming mental
images. Furthermore, Postle et al. (2008) note that the positive results can be artifacts of differences in
imageability between critical and control stimuli. For example, in the Hauk et al. (2004) study, action verbs
were compared to hash-marks as lower-level control. As a result, effector-specific activity could have been
triggered by increased imagery to concrete action language as compared with more abstract language (see
also Willems et al. 2009: 2398).

5 This limitation becomes even more apparent in languages like Dutch or German where the infinitive form
is in general distinct from tensed forms, whereas in English the infinitive coincides with the present tense
form.

6 This does not mean that Hans didn’t have a particular destination in mind; for example, the university
which he was running to.

In order to explain these differences one has to take into consideration that events
occur in time, in contrast to ‘normal’ objects like tables and trees, which persist in time.⁸
Furthermore, actions and events have a particular temporal-causal or dynamic structure.
This structure can be described in terms of a nucleus structure in the sense of Moens
and Steedman (1988), which consists of a linearly ordered sequence of constituents or
parts: a development process (DP), a culmination (Cul) and a consequent state (CS) (in
Figure 1, α(e) and β(e) are the beginning and end point of the event e, respectively).

The important point is that the examples in (1) describe different nuclei structures. 
The nucleus structure for (1a) consists of a DP only because no destination, and there- 
fore no CS (be at the destination) is specified. For (1b) the nucleus structure is the one 
depicted in Figure 1. Here a destination is determined together with the CS Hans is at 
the station. (1c) has two corresponding nuclei structures, i.e. those of (1a) and (1b). 
These examples already make clear that a nucleus structure is underspecified in at least 
two respects if only the verb, say laufen, is taken into consideration. First, the sort, or 
type, of a possible goal is not (yet) determined. Second, the exact way in which the 
running is executed is not (yet) determined. The two kinds of underspecification are
not unrelated. Consider the examples in (2).


[Figure 1 diagram: an event e extending from α(e) to β(e), made up of a development process, a culmination and a consequent state]

Figure 1: Nucleus structure for bounded processes bringing about a result


(2) a. Bill grasped the needle. 
b. Bill grasped the barbell. 


7 Though this example can also be used to describe a perfect 100 m performance by Hans in athletics.

8 Thus, for each time slice of a ‘normal’ object one always gets the complete object. By contrast, for actions
and events one usually only gets a proper part.

The way the grasping is executed depends on the object that is grasped. As noted by
Willems et al. (2009: 2307), very different action plans are necessary to successfully
execute the two actions described by the sentences in (2). Similarly, throwing a Frisbee or
a baseball requires different grips and different arm motions. These examples show that
the sortal information provided by the direct object is, at least in general, important to
resolve the underspecification with respect to the exact motor program to be executed.
To make clearer that the same verb can determine different nuclei structures, let
us consider another set of examples involving the verb kick.


(3) a. John kicked Bill. 
b. John kicked Bill several times. 
c. John kicked the ball into the goal. 
d. John kicked the bucket. 


Example (3a) can be used to describe a single (atomic) kicking, the corresponding
nucleus structure of which consists of a Cul (without a CS, see Moens and Steedman
1988 and Naumann 2001 for details): $NS_{Cul}$. A sequence of such atomic kickings is
described by (3b): $NS_{Cul^+}$. The nucleus structure is complex because it consists of a
sequence of nuclei structures having a Cul only. Sentence (3c) describes an event in
which the kicking of the ball causes the latter’s location to change: before the kicking it
was not in the goal whereas it is in the goal as an effect of the kicking. In this case, two
nuclei structures are related by a causal relation. The first nucleus structure consists
of a Cul describing the kicking proper and the second is a nucleus structure consisting
of a DP, a Cul and a CS describing the movement of the ball into the goal: $NS_1$ CAUSE
$NS_2$. For (3d), the situation is different. In this sentence, kick is not used in its literal
sense but idiomatically. Since kick the bucket means die, the nucleus structure
consists of a Cul together with a CS (be dead).
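
The four readings of kick discussed above can be written down as small data structures. The following Python sketch is only an illustration under our own assumptions: the class names `Nucleus` and `Cause`, the `iterated` flag and the helper `has_consequent_state` are hypothetical and are not part of the formal account in Moens and Steedman (1988).

```python
from dataclasses import dataclass
from typing import Tuple, Union

# Labels for the three possible constituents of a nucleus.
DP, CUL, CS = "DP", "Cul", "CS"


@dataclass(frozen=True)
class Nucleus:
    """A linearly ordered sequence of nucleus constituents."""
    parts: Tuple[str, ...]
    iterated: bool = False  # True for a sequence of such nuclei (assumption)


@dataclass(frozen=True)
class Cause:
    """Two nuclei structures linked by a causal relation."""
    cause: Nucleus
    effect: Nucleus


Structure = Union[Nucleus, Cause]

# Hypothetical encodings of the readings in (3).
KICK_3A = Nucleus((CUL,))                                  # John kicked Bill.
KICK_3B = Nucleus((CUL,), iterated=True)                   # ... several times.
KICK_3C = Cause(Nucleus((CUL,)), Nucleus((DP, CUL, CS)))   # ... into the goal.
KICK_3D = Nucleus((CUL, CS))                               # ... the bucket.


def has_consequent_state(ns: Structure) -> bool:
    """Return True if a consequent state is specified anywhere in ns."""
    if isinstance(ns, Cause):
        return has_consequent_state(ns.cause) or has_consequent_state(ns.effect)
    return CS in ns.parts
```

On this encoding, only (3c) and (3d) specify a consequent state, which mirrors the difference between the plain kickings and the readings that bring about a result.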

Reconsidering the examples in (3), one gets the following picture: after processing John kicked, which
is common to all four sentences, a comprehender cannot (yet) know which of the
four nuclei structures is described by the sentence. However, using linguistic
knowledge/experience (e.g. frequency information) as well as world knowledge (what type
of nucleus structure occurs most often in the context of a kicking), (s)he has a particular
expectation about which nucleus structure is most likely to be described. For example, the
literal (non-idiomatic) uses are in general more expected than the idiomatic sense in
(3d).⁹ For the literal uses, a possible ordering can be $NS_{Cul}$ < $NS_1$ CAUSE $NS_2$ < $NS_{Cul^+}$,
i.e. single kickings are most expected, followed by kickings that are used to obtain a
particular effect, and sequences of atomic kickings are least expected.¹⁰ In a particular
context, this default ordering may have to be changed. For example, upon hearing ... the
ball into the goal after processing John kicked ..., a comprehender comes to know that
the kicking had a destination and that therefore this sentence is not used to describe an
action of the most expected nucleus structure (Cul) but of type $NS_1$ CAUSE $NS_2$. As a
consequence, a less expected nucleus structure has to be chosen.

9 However, in a context in which it is clear that John is going to die, (3d) can be the most expected
continuation.


3 Interpreting action verbs in the brain 


In our account, understanding the meaning of an action verb is in part determined by 
knowledge of (i) the set of possible nuclei structures which describe possible temporal- 
causal evolutions of actions and events denoted by the verb and (ii) the default ranking 
among the elements of this set. This information about the meaning of a verb is stored 
in MTG. 

This knowledge is necessary but not sufficient for grasping the (complete) meaning of such a
verb, because otherwise verbs with identical sets of nuclei structures and default rankings would
have the same meaning.¹¹ Such verbs do, however, differ with respect to implicit simulations
in premotor cortex in the sense of Willems et al. (2009). Implicit simulations are
pre-enactments of potential future experiences, the principal function of which is
to make predictions about how exactly an event will evolve and what its possible
consequences are. For example, a word like grasp can serve as a cue to activate neural
circuits involved in partial preparation of an action of grasping something. As noted
by the authors: “This schematic, unconscious, prospective activation of effector-specific
regions in premotor cortex presumably facilitates further action planning if subsequent
cues call for grasping to be executed or to be imagined explicitly” (Willems et al. 2009:
2388).

Linguistically, the ranked set of nuclei structures corresponds to the level of verbs
in the lexicon. Conceptually, it can be taken as a symbolic, amodal representation of the
concept expressed by the verb that is independent of sensory and motor simulations. By
contrast, implicit simulations correspond to projections of the verb like VP or sentences.
To be precise, implicit simulations are triggered when a comprehender has enough
information to determine a specific way or manner in which the action is executed.
As shown in the previous section, this is in general the case if (s)he knows which
object undergoes the change brought about by the action. Thus, implicit simulation
corresponds to the choice of an appropriate activity, modulo the direct object of the
verb.

10 Again, it must be stressed that this ordering is to be determined empirically and that it is in general, at
least in part, context dependent. For example, in case of a penalty kick during a football match, $NS_1$
CAUSE $NS_2$ is likely to be most expected.

11 But see below for a refinement of this thesis.

12 By contrast, explicit imagery is covert enactment of an action. Like overt motor execution, motor imagery
may entail the generation of an action plan (inverse model) as well as a prediction of the action’s sensory
consequences (Willems et al. 2009: 2388). Its principal function is either reflective (i.e. covert reenactment
of prior actions) or prospective (e.g. an athlete usually imagines the concrete motor program before starting
his performance).

When taken together, the meaning of a verb consists of two dimensions: a symbolic,
amodal dimension and different ways in which these representations can be grounded
in specific activities that are undertaken in a particular situation.


Dimension           | Level of Abstraction | Reference                       | Neural Correlate           | Function                                                                     | Linguistic Level
conceptual          | symbolic and amodal  | ranked set of nuclei structures | MTG                        | determination of possible evolutions in terms of a temporal-causal structure | (isolated) verb in the lexicon
implicit simulation | grounded             | instantiated nuclei structures  | regions in premotor cortex | prediction and planning (preenactment of actions)                            | projections of the verb (VP and S)


At the conceptual dimension actions and events are taken as types (or schemes),
whereas at the second dimension these types are instantiated in a particular situation
in space and time, yielding an action or event token. This differentiation has the
advantage of computational economy since it reduces storage requirements.
Different nuclei structures can be instantiated (or grounded) in various
situations belonging to different action types. One has a small number of abstract, symbolic
and amodal temporal-causal structures (nuclei structures) that can be instantiated in
an indefinite number of concrete situations in space and time. In particular, a nucleus
structure of a particular type, say the one depicted in Figure 1 consisting of a DP Cul
CS, can be used for (i) different action verbs and (ii) different instantiations of the same
type of action. Examples of (i) are verbs like eat and run. Both eat an apple and run
to the station are of type DP Cul CS. They differ with respect to (a) the place in the
default ordering and (b) the types of possible activities that can instantiate this structure.
Whereas this nucleus structure is the most expected one for eat, this does not hold for
run, which basically describes unbounded actions with no particular goal or destination.
For eat, appropriate activities include putting food into the mouth using the hands, a
fork or a spoon or, in the case of an animal, the lips and the tongue. By contrast, for
running events appropriate activities are fast movements typically involving the legs.

If a verb is encountered, the set of possible nuclei structures in middle temporal 
gyrus is activated. In the absence of further information, a comprehender assumes that 
an event corresponding to the most expected nucleus structure (or the most expected 
nuclei structures) is (are) described. Accessing verb meanings therefore involves ac- 
cessing the corresponding nuclei structures. The more complex a nucleus structure, the 
longer the time to access and/or activate this structure. Thus, there is a cost in pro- 
cessing time that depends on the complexity of the nucleus structure. For example, the 
most expected nucleus structure for an activity verb like run is of type DP. By contrast, 
for a verb like give, which expresses a causal relation involving two different nuclei 
structures, the most expected nucleus structure is more complex.¹³
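
The access procedure just described (a default ranking, a most expected reading, and a cost that grows with complexity) can be sketched as follows. This is a toy illustration under stated assumptions: the entry `KICK`, the reading names, the ordering and the `complexity` measure (a simple count of constituents) are hypothetical choices made for this example, not empirical claims, and the iterated reading is left out for brevity.

```python
from typing import Callable, Dict, List, Tuple

Nucleus = Tuple[str, ...]   # e.g. ("DP", "Cul", "CS")
Structure = List[Nucleus]   # one nucleus, or several causally linked nuclei

# Hypothetical lexical entry for 'kick'; readings are listed in default
# order, most expected first (dicts keep insertion order in Python 3.7+).
KICK: Dict[str, Structure] = {
    "atomic": [("Cul",)],                              # John kicked Bill.
    "caused_motion": [("Cul",), ("DP", "Cul", "CS")],  # ... the ball into the goal.
    "idiom": [("Cul", "CS")],                          # ... the bucket.
}


def complexity(structure: Structure) -> int:
    """Toy measure: total number of constituents across all nuclei."""
    return sum(len(nucleus) for nucleus in structure)


def most_expected(entry: Dict[str, Structure],
                  compatible: Callable[[str], bool] = lambda name: True) -> str:
    """Return the first reading in default order that fits the input so far."""
    for name in entry:
        if compatible(name):
            return name
    raise ValueError("no reading is compatible with the input")


def involves_path(name: str) -> bool:
    """A goal phrase such as 'into the goal' requires a development process."""
    return any("DP" in nucleus for nucleus in KICK[name])
```

With no further input, `most_expected(KICK)` yields the atomic reading. Once a goal phrase has been processed, `most_expected(KICK, involves_path)` falls back to the less expected caused-motion reading, which is also the more complex one on this measure.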

The activation of the ranked set of nuclei structures does not involve immediate
activation of premotor or primary motor areas, since no particular implicit or explicit
simulation can yet be determined: the choice depends on the argument denoting
the object undergoing the change as well as on the actor executing the action.¹⁴ Rather,
premotor areas related to implicit simulations are activated only after the nuclei
structures are instantiated. As noted above, this is the case for projections of the verb, in
particular the VP and the sentence level.


3.1 Empirical evidence for our approach 


From what has been said so far, the following predictions can be derived from our
approach:

1. There is only weak activation of primary and premotor areas upon processing of
   the verb. Activation of the motor system is possible only if the underspecification
   inherent in a nucleus structure has been removed. This is in general possible only
   if the type of the object undergoing the change is known.

2. Sentences with an idiomatic sense elicit stronger activation in MTG because a
   less expected nucleus structure must be chosen. This reordering triggers a higher
   processing load reflected by a stronger activation in MTG.

3. Complex nuclei structures trigger stronger activation because, e.g., different types
   of nuclei structures must be related to each other (e.g. in a causal relation). The
   general rule is: the more complex a nucleus structure, the stronger the activation.

4. Implicit simulation depends on the expertise of the comprehender. For example,
   both experts (players and fans) and laymen understand sentences about hockey
   matches. However, players and fans are better able to implicitly simulate actions
   undertaken during a game. Thus, one expects the same activation in MTG but
   differences with respect to premotor activity.

13 $NS_1$: DP (action undertaken by the actor); $NS_2$: Cul CS (the recipient gets the theme).

14 Though this does not exclude the possibility that a comprehender activates a particular simulation
intentionally or by convention. For example, a football player or a football fan might usually immediately
engage in triggering simulations of a player kicking a football upon hearing or reading the verb kick. But
such simulations are independent of understanding the meaning of the verb or the sentence in which it
occurs. Rather, the meaning of the verb primes particular sorts of motor programs that can be used in
executing the action or event type.


Evidence for the first two predictions comes from an fMRI study by
Boulenger et al. (2008). They examined how literal versus idiomatic sentences with
action verbs referring either to the leg (kick) or the arm (grasp) are processed in the
brain.


(4) a. He kicked the ball. 
b. He kicked the bucket. 
(5) a. He grasped the needle. 
b. He grasped the idea. 


Brain activity was measured at the onset of the critical word in the sentence (He
grasped the IDEA), which disambiguated between a literal and an idiomatic reading
(early analysis window), and three seconds after its end (late analysis window). They
found that (i) a common network of cortical activity was triggered for both conditions
in both analysis windows, with the idioms eliciting overall more distributed activity;
(ii) primary and premotor cortices were activated both for idioms and non-idioms; (iii)
activation of (frontocentral) primary and premotor areas was relatively weak both at
action verb onset (and therefore upon processing the action verb) and at the onset
of the critical word, but strong after the offset of the critical word, both
for literal and idiomatic readings; (iv) sentences with literal meanings failed to elicit
stronger activation than sentences with an idiomatic reading in any brain area; (v) in
the late analysis window, cortical activity for idioms was greater in MTG and the cerebellum.¹⁵

In the present context findings (iii) and (v) are the most important ones. Finding (iii)
shows that there is no instant spreading of activation to primary or premotor cortex
during action verb processing. Rather, this activation is delayed until after the direct
object has been processed. This is in contrast to the results for processing isolated
action verbs. Finding (v) can be taken as providing evidence for our claim that in case a
verb is used in an idiomatic sense the default ordering on the set of nuclei structures
must be changed (i.e. there is a reordering of the elements of this set), resulting in a
higher processing load, reflected in the higher activity in MTG.¹⁶

15 Furthermore, there was stronger activation for idioms in inferior frontal gyrus in both windows.

Evidence for the third prediction comes from two studies by Shetreet et al. (2007) 
and Van Dam and colleagues (2010), respectively. Shetreet and colleagues found that 
MTG responds more strongly to sentences with verbs that have more arguments, even 
when the sentences have the same overall length. For example, processing John gave 
Mary the book (three arguments) triggers stronger activity in MTG than the sentence 
John ran to the station (two arguments). In our approach, a verb like give is related to a 
complex nucleus structure consisting of two substructures that are linked by a causal 
relation. The first nucleus structure describes the action undertaken by the giver (actor) 
whereas the second nucleus structure describes the event of the recipient receiving (and 
thereby coming to possess) the theme, i.e. the object given. Van Dam and colleagues 
(2010) found that the processing of action verbs like wipe that denote events describing 
a particular way of moving part of the body triggers stronger inferior parietal activity 
than verbs like clean for which no such manner is determined. This finding can be 
explained as follows. Levin and Rappaport-Hovav (to appear) distinguish between verbs 
of manner and verbs of result. Manner verbs specify a particular way in which an 
action is executed. For example, wipe and brush determine a particular way of cleaning 
an object without imposing the constraint that the result be attained at the end of the 
event. By contrast, result verbs specify a particular end state of the action. For example, 
clean requires the object undergoing the change, say a table, to be clean as a result of 
the cleaning activity undertaken by the actor. However, no specific type of activity (or 
manner) by which this end state is achieved is determined by the verb. In our approach, 
manner verbs like wipe have a most expected nucleus structure of type DP, i.e. they 
are basically activity verbs that are usually used to describe unbounded events which 
need not bring about a particular result (similar to a verb like run). By contrast, a result 
verb like clean has a most expected nucleus structure of type DP Cul CS. However, for
clean only the culmination is explicitly determined (the object has to be clean) but no
particular activity.


16 For details on how such orderings can be changed, see Naumann (2011, 2013, 2014). 
In our approach, this means that the interpretation of a result verb is already
determined by the ranked set of nuclei structures since the only constraint is the one
imposed on the end state (be clean), which is already specified at the lexical level and
which therefore is independent of the object undergoing the change. As a consequence,
being able to implicitly simulate how the action can be executed is not part of the
meaning of the verb. From this it does not follow that a comprehender does not engage in an
implicit simulation (and, additionally, in explicit imagery). But in this case, (s)he plans
or imagines an execution that can be described by another verb, say wipe as in wipe the
table clean.

Further evidence for our analysis comes from a study by McKoon and Macfarland
(2000). They showed that there are no differences in processing time between transitive
and intransitive uses of so-called externally caused event verbs like break and awake.


(6) a. The fire alarm awoke the residents.
    b. The residents awoke.


By contrast, for internally caused event verbs like bloom and wilt, processing times
are significantly shorter than those for externally caused event verbs.


(7) a. The bright sun wilted the roses. 
b. The roses wilted. 


Again, there are no differences between the transitive and the intransitive form. 
These results therefore show that the processing time depends on the type of the (pre- 
ferred or most expected) nucleus structure. Furthermore, these examples show that the 
cost in processing time is independent of the exact syntactic realization (transitive vs. 
intransitive). Rather, it only depends on the corresponding types of nuclei structures. 

Similar results were obtained by Gennari and Poeppel (2003). They showed that
processing non-stative verbs like vanish and solve takes about 25 ms longer than processing
stative verbs like love and exist, even if the argument structures are identical (e.g.
exist and vanish).

Evidence for the fourth prediction comes from a study by Beilock and colleagues
(2008). They let hockey players, hockey fans and hockey novices listen to sentences
about hockey-related actions. They found that both for hockey players and hockey
fans there was increased activity in dorsal premotor cortex compared to the activity
in this area for hockey novices. Furthermore, this stronger activity was influenced by
experience with hockey games but not necessarily by motor experience directly related
to playing the sport. For example, dorsal premotor activity was the same for hockey players and
hockey fans. In addition, the primary sensory-motor cortices were active only for hockey
novices, and increased primary sensory-motor activity correlated negatively with
action sentence comprehension.

The above empirical results can be taken as evidence for the following two hypothe- 
ses: (i) result verbs are not directly related to particular implicit simulations or motor 
programs and (ii) for result verbs, grounding of a corresponding nucleus structure is, at 
least in part, independent of their types. By contrast, manner verbs require (i) activation 
of the related ranked set of nuclei structures in MTG and (ii) an implicit simulation in 
premotor cortex (in order to distinguish say brush from wipe). These hypotheses raise 
the following questions: (i) what is the exact relation between the ranked set of nuclei 
structures and implicit simulations? And (ii) where is this relation stored in the brain 
(i.e. what is the neuronal correlate of this relation)? One answer to the first question 
is that the ranked set of nuclei structures for a verb in MTG primes certain implicit 
simulations in premotor cortex. To be more precise: both manner and result verbs are 
related to a set of appropriate activities. Information about these activities is stored in 
regions of premotor cortex. For manner verbs this set is more restricted than that for 
result verbs. Furthermore, and more importantly, the set of activities for manner verbs 
is ranked in the sense that not all elements in this set are equally expected. By contrast, 
for result verbs there is no ranking on this set. For example, for wipe, one has rub with 
a cloth or one’s hand and for brush, rub with a brush. The set of appropriate activities 
for clean comprises those for wipe and brush (and those for other manner verbs which 
denote actions for cleaning something). A possible answer to the second question goes 
along Hebbian lines. Neuronal correlation is mapped onto connection strength. If an 
action verb frequently co-occurs with body movements that are executions of an ac- 
tion of the type denoted by the verb, this strengthens the connection between regions 
in MTG and regions in premotor cortex. There remain, of course, a number of open 
empirical questions, for example: Where in the brain is the ‘meaning assembly’ between 
a verb and its arguments located, i. e. what is the exact relation between verbal (dynamic) 
and non-verbal (static) meanings? and How is the ranked set of nuclei structures acquired 
in the brain during language learning? 
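
The Hebbian idea mentioned above, that frequent co-occurrence is mapped onto connection strength, can be illustrated with the textbook Hebbian update, in which a weight grows by the product of the two activities. The sketch below is a deliberate simplification under our own assumptions: a single scalar weight stands in for the connection between a region in MTG and a region in premotor cortex, and the function names, the learning rate of 0.1 and the example episodes are hypothetical.

```python
from typing import Iterable, Tuple


def hebbian_update(weight: float, verb_active: float, motor_active: float,
                   rate: float = 0.1) -> float:
    """Return the weight after one episode: w + rate * pre * post."""
    return weight + rate * verb_active * motor_active


def train(episodes: Iterable[Tuple[float, float]], rate: float = 0.1) -> float:
    """Apply the update to a sequence of (verb, movement) activity pairs."""
    weight = 0.0
    for verb_active, motor_active in episodes:
        weight = hebbian_update(weight, verb_active, motor_active, rate)
    return weight


# Hypothetical learning histories: the verb is heard in every episode, but
# the matching body movement accompanies it often or only rarely.
frequent = [(1.0, 1.0)] * 10
rare = [(1.0, 1.0)] * 2 + [(1.0, 0.0)] * 8
```

Under these assumptions `train(frequent)` ends with a stronger connection than `train(rare)`, since episodes in which the verb occurs without the movement leave the weight unchanged.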

Furthermore, the above results also show that the various dimensions are not inde- 
pendent of each other. When taken together, the findings of the empirical studies used 
in this article suggest the following relation. Both implicit and explicit simulations are
functionally or causally dependent on the conceptual domain consisting of the ranked
set of nuclei structures in MTG. Empirical evidence supporting this claim is: (i) MTG
responds to verbs in isolation for sentences with transitive verbs and (ii) the motor
system is activated only after the direct object has been processed. Thus, when processing
a verb, regions in MTG are activated but no effector-specific activity in the motor sys- 
tem is (yet) triggered. Consequently, MTG is activated prior to the motor system. By 
itself, this temporal relation does not show that there is a functional or causal relation 
between those dimensions. However, both types of activity are directly related to pro- 
cessing the verb and therefore to understanding its meaning, which makes it likely that 
some functional relation is involved. Of course, this claim needs to be confirmed by 
further empirical investigations. 

Finally, an important empirical question is this: is the ability to trigger implicit
simulations in premotor cortex constitutive of grasping the meaning of (or having the
concept corresponding to) an action verb? In our approach the answer is negative for
the following reason. The two dimensions in the meaning of an action verb correspond
to different functions that language and cognition have. The conceptual dimension is related
to naming and recognizing objects of the given type. Evidence for this comes from
studies of patients suffering from apraxia as well as from the discussion of the results about
hockey obtained by Beilock and colleagues. This dimension is non-goal-oriented in the
sense that no implicit preenactment of a possible execution is involved.¹⁷ The second
dimension, i.e. implicit simulation, is related to reflecting, predicting and planning an
action of the given type by selecting appropriate activities and inferring future
consequences of executing this action. Possible questions are: How can the goal be reached?,
What is an appropriate activity to reach the goal or to execute the action? and What are
possible consequences of executing the action? This dimension therefore is goal-oriented
at a theoretical level (i.e. it does not involve the ability to execute a motor program).
This ability is a necessary condition for being able to attain a goal or result by executing
an action of the given type. For example, in the case of eating one can use the hands or,
alternatively, a fork and a knife. By contrast, explicit imagery corresponds to the ability
of actually executing a motor program to attain the goal.¹⁸ The inability to have implicit
simulations impairs a speaker with respect to this particular function. This is the case for patients
suffering from apraxia. However, the Beilock et al. study shows that the ability
to have implicit simulations comes in degrees. Hockey novices do have activity in the
motor system, though it is less strong than the activity triggered in hockey players and
hockey fans.¹⁹

17 Though it may involve naming the goal of a possible execution, e.g. making an object clean for the verb
clean since involving a goal (Cul) is part of the most expected nucleus structure of this verb.

18 Additional evidence for this analysis comes from studies of apraxia, i.e. the inability to perform particular
activities as a result of brain damage. People suffering from this inability are impaired for using objects of a
particular kind, say a hammer, though they are unimpaired for (i) naming those objects and (ii) recognizing
pantomimes associated with uses of those objects. Thus, integrity of motor processes is not necessary in
order for object naming and action recognition to be in the normal range; see Mahon and Caramazza (2008)
for details.


3.2 Comparison to theories of grounded cognition 


Our approach differs from theories of grounded cognition in the following respects. 
First, ‘automatic activation’ does not mean that the motor system is immediately ac- 
tivated when a verb is processed in the brain, i.e. that linguistically processed input 
immediately results in activation of the motor and sensory systems. Rather, what is 
immediately activated is the ranked set of nuclei structures. Groundedness is not an 
attribute of the verb proper but rather a property of its projections like VP or S. The 
reason for this is that the conceptual level stored in MTG is impoverished in the sense 
that verbs which have the same ranked set of nuclei structures cannot be distinguished. 
This distinction is only made if a nucleus structure is instantiated. The neural correlate
of this instantiation is an implicit simulation in premotor cortex. 

It may seem that this view is contradicted by the results of Hauk et al. (2004) and 
others showing that the motor system is activated rather quickly. Recall that Hauk 
et al. found that when presented with the word kick the ‘leg’ region of the motor 
system is activated within a time span of about 200 ms. Yet those results do not provide 
counterevidence to our claims. First, those results were obtained for isolated verbs and 
not for sentences in which these verbs occur as a constituent and this fact was known 
to the participants. When taken in isolation, a verb like kick is interpreted as uniquely
describing a nucleus structure consisting only of a Cul because a comprehender already
knows that no further information, say about a goal of the kicking, will be added that might
make it necessary to change the nucleus structure to one of type Cul CAUSE $NS_2$.

A second difference is that in our approach, following Willems et al. (2009), a
distinction is made between implicit simulations and explicit imagery. Third, explicit imagery
is an epiphenomenon of processing (and thereby understanding the meaning of) the
verb. As pointed out in the previous section, a verb (or its corresponding ranked set
of nuclei structures) primes certain ways in which an action denoted by the verb is
executed. As a result, an implicit simulation can be triggered. This way of undertaking the
action may subsequently result in explicit imagery of the corresponding action. Fourth,
and most importantly, a distinction between a symbolic and amodal dimension and a
grounded dimension is made in the definition of the meaning of an action verb.

19 However, it remains an open empirical question whether this activity is related to both implicit
simulation and explicit imagery or to only one of those activities in the motor system.

Summarizing, one can say that theories of grounded cognition only capture one par- 
ticular dimension of a verb’s meaning, i.e. that related to the motor system. However, 
they usually do not distinguish between implicit simulations and explicit imagery. In 
addition, if it is true that both of these activities in the motor system are functionally 
and causally dependent on a conceptual dimension, they fail to give a satisfactory ac- 
count of how meanings are represented and accessed in the human brain. This failure is 
in large part due to the fact that most often only isolated verbs and not larger linguistic 
contexts, like sentences, in which those verbs occur are considered. 

Another way of comparing theories of grounded cognition and ours is the following. 
Mahon and Caramazza (2008) distinguish four possibilities of how the motor system
can be related to a conceptual dimension.

1. Processing the verb directly activates the motor system, with no intervening access
   to abstract conceptual content.

2. Processing the verb directly activates the motor system and in parallel activates
   abstract conceptual content.

3. Processing the verb directly activates the motor system and then subsequently
   activates an abstract conceptual representation.

4. Processing the verb directly activates an abstract conceptual representation and
   then activates the motor system.


Only on the fourth possibility is the conceptual dimension activated before the motor 
system, whereas in the other three possibilities the motor system is either independent 
of the conceptual dimension (1), works in parallel with it (2) or there is a cascading flow 
of information from the motor system to the conceptual dimension (3). The first three 
possibilities underlie the various forms of theories of grounded cognition: The motor 
system is never activated after the conceptual system (provided the latter is assumed 
at all). Our approach is characterized by the fourth possibility. First, the ranked set of 
nuclei structures in MTG is activated and subsequently implicit simulations in specific
premotor areas are triggered by a spreading activation.


4 Comparison to other approaches


Similar to our approach, the grounding by interaction account proposed in Mahon and 
Caramazza (2008) distinguishes between an abstract or symbolic level of representa- 
tion and its instantiation (or grounding) in a particular situation. The symbolic level 
is conceptual and characterized by various output modalities like being able to name an 
object or action falling under the given concept or knowing something about the way it 
is built up or construed. For example, in the case of a hammer this conceptual knowl- 
edge possibly involves being able to recount the history of the hammer as an invention, 
the materials of which the first hammer was made, or what hammers typically weigh 
(Mahon and Caramazza 2008: 67 f.). This conceptual information can apply to diverse 
sensory modalities like touch, vision or audition. What is missing from this level is the 
interaction with the world. Conceptual information is not isolated. Rather, it can be 
activated by events in the world that are processed by the sensory system. As an effect, 
the conceptual information gets instantiated in a particular situation. The specific sen- 
sory and motor information that is activated may change depending on the situation in 
which the abstract conceptual information is instantiated (Mahon and Caramazza 2008: 
68). However, from this it does not follow that the sensory and motor information is 
constitutive of the concept. Rather, removing the sensory and motor system would re- 
sult in impoverished and isolated concepts. Thus, the activation of sensory and motor 
processes contributes to the ‘full’ representation of the concept. 

The approach presented here bears some similarity with constraint-satisfaction- 
based approaches, like that of Jurafsky (1996) for example. According to such accounts, 
the processing of a sentence first involves the activation of several possible interpreta- 
tions. These interpretations are ranked according to a probability measure that is based, 
among other factors, on the likelihood of a particular word being used in a particular 
context or the likelihood of a verb to be used with a particular meaning. For exam- 
ple, the noun nail refers either to a body part (fingernail, toenail) or a metal fastener. 
Processing this word therefore involves activation of brain areas related to both meanings
of the word.²⁰ This set of possible interpretations is narrowed down when further
information in the sentence is processed: The nail he used to put up the picture.
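
The narrowing-down step just described can be made concrete with a minimal sketch. It assumes, purely for illustration, that each sense of nail starts with a prior probability and that each later word contributes a likelihood for each sense. The function names, the cue table and all numbers are invented here and are not taken from Jurafsky (1996).

```python
# Toy ranked-interpretation model. All probabilities are invented for illustration.

def rank(scores):
    """Return the senses ordered from most to least probable."""
    return sorted(scores, key=scores.get, reverse=True)


def update(scores, likelihoods, floor=0.05):
    """Reweight each sense by its fit with the new word, renormalise,
    and prune senses whose probability falls below `floor`."""
    weighted = {s: p * likelihoods.get(s, 1.0) for s, p in scores.items()}
    total = sum(weighted.values())
    normalised = {s: p / total for s, p in weighted.items()}
    return {s: p for s, p in normalised.items() if p >= floor}


# Hypothetical priors for the two senses of "nail" before any context is seen.
senses = {"body part": 0.5, "metal fastener": 0.5}

# Hypothetical fit of each sense with later words of
# "The nail he used to put up the picture".
cues = {
    "used": {"body part": 0.3, "metal fastener": 0.9},
    "picture": {"body part": 0.05, "metal fastener": 0.9},
}

for word in ["used", "picture"]:
    senses = update(senses, cues[word])
```

With these made-up values both senses survive the first update, and only the fastener sense remains once picture has been processed. The pruning threshold is a design choice of the sketch, not a claim about the model.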


20 According to Zwaan and Kaschak (2008), from which this example is taken, the processing involves the 
activation of traces or mental simulations that are relevant to both senses of the word, in accordance with 
the embodiment thesis. 


5 Conclusion


In this article we presented a theory of action verbs that combines an abstract, modality- 
independent component with a modality-specific component located in regions of pre- 
motor cortex. Semantically, this analysis is based on the observation that a verb like 
kick can be used to express different types of actions (kick/kick the ball/kick the ball 
into the goal) that differ with respect to parameters like telic/atelic, result/no_result or 
atomic/iteration. The conceptual information about events consists of the different types of
nuclei structures, and the meaning-relevant information about a verb is the ranked set
of such structures, which represents the conceptual dimension of its meaning. This information
is amodal and concerns the temporal-causal structure of an action or event. It is
stored in MTG, which has been shown to respond to the processing of verbs as opposed 
to nouns and adjectives. 

This temporal-causal structure is underspecified with respect to the exact way or 
manner (motor program) an action of a particular type is executed because this way 
depends on the object undergoing the change. After combining with the direct object of 
the verb, this structure is grounded or instantiated by a spreading activation to premotor 
cortex leading to an implicit simulation which makes it possible to derive additional
conclusions about this structure.


6 References 


Barsalou, L. W. (1999). Perceptual symbol systems, Behavioral and Brain Sciences, 22, 
577-660. 


Beilock, S. L. et al. (2008). Sports experience changes the neural processing of action
language, Proceedings of the National Academy of Sciences USA, 105, 13269-13273.


Boulenger, V., O. Hauk and F. Pulvermüller (2009). Grasping ideas with the motor system:
semantic somatotopy in idiom comprehension, Cerebral Cortex, 19, 1905-1914.


Buccino, G., L. Riggio, G. Melli, F. Binkofski, V. Gallese and G. Rizzolatti (2005). Listening
to action related sentences modulates the activity of the motor system: a combined
TMS and behavioral study, Cognitive Brain Research, 24, 355-363.


Gallese, V. and G. Lakoff (2005). The brain’s concepts: the role of the sensory-motor 
system in conceptual knowledge, Cognitive Neuropsychology, 22, 455-479. 


Gennari, S. and D. Poeppel (2003). Processing correlates of lexical semantic complexity,
Cognition, 89, B27-B41. 




Graziano, M. (2006). The organization of behavioural representation in motor cortex,
Annual Review of Neuroscience, 29, 105-134.

Hauk, O., I. Johnsrude and F. Pulvermüller (2004). Somatotopic representation of action
words in human motor and premotor cortex, Neuron, 41, 301-307.

Jurafsky, D. (1996). A probabilistic model of lexical and syntactic access and disambiguation,
Cognitive Science, 20, 137-194.

Levin, B. and M. Rappaport Hovav (to appear). Lexicalized meaning and manner/result 
complementarity, In: B. Arsenijevic, B. Gehrke, and R. Marin, (Eds.), Subatomic 
Semantics of Event Predicates, Springer, Dordrecht. 

Mahon, B. Z. and A. Caramazza (2008). A critical look at the embodied cognition hypothesis
and a new proposal for grounding conceptual content, Journal of Physiology - Paris,
102, 59-70.

McKoon, G. and T. Macfarland (2000). Externally and internally caused change of state
verbs, Language, 76, 833-858.

Miller, E. K. and J. D. Cohen (2001). An integrative theory of prefrontal cortex function,
Annual Review of Neuroscience, 24, 167-202.

Moens, M. and M. Steedman (1988). Temporal ontology and temporal reference, Com- 
putational Linguistics, 14: 2, 15-28. 

Naumann, R. (2001). Aspects of changes: a dynamic event semantics, Journal of Semantics, 18,
27-81.

Naumann, R. (2011). Relating ERP-effects to theories of belief update and combining systems.
In: M. Aloni et al. (Eds.), Proc. 18th Amsterdam Colloquium, LNCS, Springer,
Berlin.

Naumann, R. (2013). Outline of a dynamic theory of frames. In: G. Bezhanishvili, S.
Löbner, V. Marra, and F. Richter (Eds.), Proceedings of the 9th International Tbilisi
Symposium on Language, Logic and Computation, Volume 7758 of Lecture Notes in
Computer Science, pp. 115-137. Springer Berlin Heidelberg.

Naumann, R. (2014). A dynamic update model of sentence processing, ms, University of
Düsseldorf.

Postle, N. et al. (2008). Action word meaning representations in cytoarchitectonically
defined primary and premotor cortex, Neuroimage, 43, 634-644.

Pulvermüller, F. (1999). Words in the brain’s language, Behavioral and Brain Sciences, 22,
253-336.

Pulvermüller, F. (2005). Brain mechanisms linking language and action, Nature Reviews
Neuroscience, 6, 576-582.




Schluter, N. D. et al. (1998). Temporal interference in human lateral premotor cortex
suggests dominance for the selection of movements. A study using transcranial
magnetic stimulation, Brain, 121, 785-799.

Schubotz, R. I. and D. Y. von Cramon (2004). Sequences of abstract nonbiological stimuli
share ventral premotor cortex with action observation and imagery, Journal of Neuroscience,
24, 5467-5474.

Shetreet, E. et al. (2007). Cortical representation of verb processing in sentence compre- 
hension: number of complements, subcategorization and thematic frames, Cerebral 
Cortex, 17, 1958-1969. 

Van Dam, W. O. et al. (2010). How specifically are action verbs represented in the neural 
motor system: an fMRI study, Neuroimage, 53, 1318-1325. 

Willems, R. M. et al. (2009). Neural dissociations between action verb understanding and
motor imagery, Journal of Cognitive Neuroscience, 22, 2387-2400.

Zwaan, R. and M. Kaschak (2008). Language in the brain, body, and world, In: P. Robbins
and M. Aydede (Eds.), Cambridge handbook of situated cognition, Cambridge: CUP,
368-381.



SANDER LESTRADE 


The place of Place (according to spatial case) 


Sander Lestrade s.lestrade@let.ru.nl 
Radboud University Nijmegen 
Centre for Language Studies 


Abstract 

This paper addresses the question whether we should analyze Place, expressing the ab- 
sence of a change of location, on a par with mode expressions specifying the type of such 
a change, i.e. Source and Goal. By cross-linguistic study of spatial case systems, various 
options of analysis are considered and illustrated. It is concluded that languages may 
differ in their spatial expression of Place, suggesting a non-uniform semantics and, pos- 
sibly, conceptualization. Also, it is proposed to view these various analyses as diachronic 
variants. 

Keywords: spatial language, Place, mode/directionality, morphological decomposition 


1 Introduction¹


If a moving entity is to be localized, it generally does not suffice to merely provide a 
location.² Instead, it needs to be made clear at which interval of the motion event this
locatum can be found there. For this, mode expressions such as to and from can be used 
(mode is probably better known as directionality, a tradition that is not followed here for 
reasons explained in Lestrade 2011 and 2012). Mode expressions restrict the location of 
a locatum to a specific interval of the event only, for example to the end point (Goal) or 
to the starting point (Source) of the motion event. In the following example, the locatum 
John is said to be in the house at the end point of the walking event only by the mode
expression -to:

(1) John walked into the house.


The question to be addressed in this paper is whether we should acknowledge Place,
which would then locate the locatum at the location throughout the whole event, as a
third distinction of mode on a par with Source and Goal. That is, should we think of
mode as an obligatory dimension, defaulting to Place mode in the absence of motion,
or rather as an optional dimension of spatial expressions that is only used if necessary,
in combination with motion verbs (only distinguishing Source and Goal)? Before
discussing this question in more detail, let us further agree on the terminology: The
locations that mode assigns to some point in time are named regions expressed by the
configuration function, for example ‘in’, ‘under’, and ‘between’. These locations are defined
with respect to a reference object called the ground. In (1), the configuration is ‘in’
and the house is the ground, therefore the location is the inside of the house; with John
being the locatum and the mode being Goal, John is said to be in the house at the end
point of the walking event only.

1 I would like to thank an anonymous reviewer for comments and suggestions that helped to improve this
paper.

2 For original terminology and discussion, see Talmy (1990), Jackendoff (1983), Kracht (2002), Wälchli and
Zúñiga (2007), Levinson (2000), and Bateman et al. (2010).

The reason to consider Place as a mode option, something that may seem unnecessary
from an English perspective, can be illustrated by the following part of the spatial
case paradigm of Hungarian:


(2) Partial Hungarian case paradigm

    házra               házon             házról
    ‘onto the house’    ‘on the house’    ‘off the house’
    (superlative)       (superessive)     (superelative)

Spatial expressions in Hungarian consistently come in three variants, one for Goal, one 
for Source, and one for Place (a term that necessarily remains without proper definition 
in this first part of the paper). This three-way distinction suggests that, morphologically
at least, Place may be on a par with Goal and Source in some languages. But whereas 
analyses of mode all agree on accepting Goal and Source, they differ in whether they 
recognize Place as a distinction of mode too (Kracht 2002, 2008; Lestrade 2010, 2011) 
or analyze it as the absence of such a distinction instead (e. g., Jackendoff 1983, 1990; 
Zwarts 1997, 2005; Wunderlich 1991; Schank 1973). 

Intuitively, it could be argued both ways indeed. If mode is defined as restricting the 
scope of the location (of some locatum) to an interval either before or after a change 
of location, this function does not apply in the absence of such a change. On the 
other hand, mode could be argued to be an obligatory ingredient of spatial meaning 
and/or spatial expressions. In this case, the link between the location and the event 
time is always made, irrespective of whether they concern stative or motion events, 
and possibly by zero markers for specific modes for reasons of economy. (The use of 
zero markers is not as obscure a strategy as it may seem, cf. the use of zero markers 
for what is called nominative/absolutive case in many languages; de Hoop and Zwarts
2010; Creissels 2010.) Whereas Goal and Source temporally restrict a location to the end
or beginning of an event, Place mode in this view expresses that some location holds for
the whole event. The two options are illustrated for English in (3) and (4); example (5) is
given for contrast with an overt mode expression.


(3) Place as the absence of mode (mode is optional)

    John       is walking    in                  the house.
    locatum    V             configuration:in    ground

(4) Place as a distinction of mode (mode is obligatory (and zero marked in English))

    John       is walking    Ø             in                  the house.
    locatum    V             mode:Place    configuration:in    ground

(5) Source mode (for contrast)

    The cat    is coming    from           under                  the table.
    locatum    V            mode:Source    configuration:under    ground

In fact, the choice is more complicated: It could be argued that there are three possibilities
when barring Place from the mode domain. First, it could simply be the absence
of mode as just illustrated in (3). Second, however, Place could be a generalized configuration.
In this case, it generalizes over all possible configurations, i.e. ‘in’, ‘under’, etc.,
expressing that although there necessarily is some configurational relation between the
locatum and the ground in the world out there, its linguistic specification is deemed
unnecessary (for example because it is completely predictable, as is often the case with
typical pairings such as between coffee cups and tables). Third, the function of Place
could be to change the named region referred to by the configuration into a predicate
that establishes the link between a location and the locatum, for example changing ‘in
the house’ into LOC(LOCATUM, IN THE HOUSE). This predicate may then subsequently be
specified temporally by mode expressions if necessary. Under this analysis, Place is just
another term for the locative function, a semantic function necessary for a compositional
semantics of the spatial expression (cf. a.o. Creary, Gawron, & Nerbonne, 1989;
Wunderlich, 1991; Zwarts, 1997; Kracht, 2002; Bateman et al., 2010). The different options are
illustrated in the abstract in the following examples.³


(6) Place as the absence of mode:

[mode {Source, Goal} [configuration {‘in’, ‘under’, etc.} ] ]


(7) Place as a generalized location:

[mode {Source, Goal} [configuration {Place, ‘in’, ‘under’, etc.} ] ]


(8) Place as the locative function:

[mode {Source, Goal} [locative function Place [configuration {‘in’, ‘under’, etc.} ] ] ]


(9) Place as a distinction of mode:

[mode {Source, Goal, Place} [configuration {‘in’, ‘under’, etc.} ] ]


3 Square brackets (“[]”) show the scope of the functions mentioned in the subscripts; curly brackets (“{}”) list
the options a function has.
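
The four schemas in (6) to (9) can be read as small slot grammars, and a minimal sketch makes their different predictions explicit. The sketch rests on assumptions that are not spelled out in the schemas themselves: mode is treated as optional in (6) to (8) and obligatory in (9), the locative function in (8) is treated as obligatory, and the open list of configurations is cut down to two stand-ins. The names `ANALYSES`, `expressions` and `licenses` are hypothetical.

```python
from itertools import product

# Stand-ins for the open list 'in', 'under', etc.
CONFIGS = ["in", "under"]

# Each analysis is an ordered list of (slot, options), outermost slot first.
# None marks a slot that may stay empty (an assumption of this sketch).
ANALYSES = {
    "absence of mode (6)": [
        ("mode", [None, "Source", "Goal"]),
        ("configuration", CONFIGS),
    ],
    "generalized location (7)": [
        ("mode", [None, "Source", "Goal"]),
        ("configuration", ["Place"] + CONFIGS),
    ],
    "locative function (8)": [
        ("mode", [None, "Source", "Goal"]),
        ("locative function", ["Place"]),
        ("configuration", CONFIGS),
    ],
    "distinction of mode (9)": [
        ("mode", ["Source", "Goal", "Place"]),
        ("configuration", CONFIGS),
    ],
}


def expressions(analysis):
    """Enumerate every slot filling an analysis allows, dropping empty slots."""
    slots = ANALYSES[analysis]
    results = []
    for choice in product(*(options for _, options in slots)):
        results.append(tuple(value for value in choice if value is not None))
    return results


def licenses(analysis, *values):
    """True if the analysis allows exactly this sequence of slot values."""
    return tuple(values) in expressions(analysis)
```

Under these assumptions, `licenses("generalized location (7)", "Place", "in")` is false, because Place and a specific configuration compete for one slot, whereas `licenses("locative function (8)", "Source", "Place", "under")` is true. In (9), Place and Source compete for the mode slot instead.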


In the next section, these options will be illustrated with concrete cross-linguistic
examples. It will also be shown that it is not possible to decide between these
options, or rather, that cross-linguistic data suggest that each of these analyses may
be true for at least some languages. Accordingly, this paper will argue that although
Place may not be a full-fledged distinction in the mode systems of all languages, our
analysis of mode should at least leave open the possibility for Place to become one
of its distinctions. Importantly for the topic of the present collection of papers, such
different morphosyntactic behavior between languages bears on our account of the
cognitive representation of spatial meaning: If the spatial systems of languages differ
in fundamental ways, we may have to conclude that our cognitive representation
of space is not universal either (cf. for example Levinson 1996 and Li and Gleitman 2002).


2 Methodology 


To illustrate the different analyses above, we will make use of a method called morpho- 
logical decomposition. This method assumes a fair degree of compositionality between 
spatial expressions and spatial meaning: If some morpheme can be straightforwardly 
linked to a semantic function, its very use is taken as evidence for the existence of this 
function. In fact, we have already used this method in our examples above, suggesting 
that there is such a thing as Source mode in English on the basis of the use of from. As the
input for our decomposition exercise, we will consider a number of spatial case systems 
(for a more elaborate discussion of spatial case inventories and the motivation to use 
them in studies of spatial language, cf. Lestrade 2012). The reasoning goes as follows. If 
in a system of paradigmatic oppositions the markers of Place are at the same level as 
the markers of Goal and Source, we may want to conclude that Place semantically is 
on a par with Goal and Source too. That is, if Place is mutually exclusive with Goal 
and Source and all three may be added on top of configuration distinctions, we should 
probably analyze Place, Goal and Source alike as mode options. If, on the other hand, 
the markers for Goal and Source morphologically include the marker for Place, this
suggests that Place is the input of Goal and Source semantically too.
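
The reasoning above can be sketched as a small diagnostic over one row of a paradigm. This is only an illustration of the logic, not a procedure used in this paper: the function name `diagnose`, the representation of a marker as a list of morphemes, and the two toy rows in the usage note are assumptions made for the example.

```python
def diagnose(place, goal, source):
    """Classify one row of a spatial paradigm from its surface markers.

    Each argument is a list of morphemes, innermost first, e.g. ["g", "e"].
    The containment test is deliberately crude: it only compares morpheme
    lists and knows nothing about zero markers or sound change.
    """
    def built_on(marker, base):
        # True if `marker` consists of `base` plus at least one more morpheme.
        return len(marker) > len(base) and marker[:len(base)] == base

    if not place:
        return "no overt Place marker"
    if not goal or not source:
        return "inconclusive"
    if built_on(goal, place) and built_on(source, place):
        return "Place is the input of Goal and Source"
    same_stem = place[:-1] == goal[:-1] == source[:-1]
    distinct_endings = len({place[-1], goal[-1], source[-1]}) == 3
    if same_stem and distinct_endings:
        return "Place is on a par with Goal and Source"
    return "inconclusive"
```

With made-up forms, `diagnose(["loc"], ["loc", "to"], ["loc", "from"])` reports that Place is the input of Goal and Source, while `diagnose(["g", "e"], ["g", "u"], ["g", "a"])` reports that Place is on a par with them. A single row proves little; the verdict would have to be consistent across the whole paradigm before it is generalized.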

By deconstructing the spatial expressions into their morphological parts in a number
of languages, it will be shown that there is some truth in each of the analyses, or,
phrased less optimistically, that the evidence from the morphological decomposition of 
spatial case is not conclusive to decide once and for all which of the options should be 
considered the right one. But before we get there, it should be noted that there is an 
important caveat to this procedure. Morphological markers may be developed over and 
over again within a stable system of oppositions (Kiparsky 2012) and apparent inclusion 
relations may only be a coincidence. Therefore, evidence from this method should only 
be generalized if the results are consistent throughout the spatial expressions between 
or, depending on the range of the generalization, within languages. Secondly, the in- 
terpretation of the results partly depends on whether or not one accepts zero markers. 
Whereas zero expressions are wholeheartedly accepted by many linguists, they are at 
the same time forcefully rejected by many others. In general, however, their rejection 
causes increased complexity or idiosyncrasy at some other point of the analysis. The 
choice thus seems to be between accepting a zero for a more general semantics vs. a 
WYSIWYG account at the cost of generality. For present purposes, zero markers are
only modestly allowed and avoided whenever possible.


3 Analyses of Place 


3.1 Place as the absence of mode 


If Place is really the absence of mode, as again schematically represented in (10), it
should not appear.⁴ For if Place overtly marked the absence of Goal and Source, we
probably would want to analyze it as a mode distinction itself. That is, more generally,
whereas specific levels of a function may be defined negatively with respect to other
levels (e.g. that as ‘not this’), we probably do not expect a linguistic expression to
express the absence of an (abstract) function (e.g. the in terms of the absence of deixis).


(10) Place as the absence of mode: 


[mode {Source, Goal} [configuration {‘in’, ‘under’, etc.} ] ]


In some languages, the absence of a change of location is indeed covertly expressed 
only, and therefore, on the basis of these languages, Place could be said not to exist (“to 
be the absence of mode”). Rather than using an exotic spatial case paradigm, English
prepositions may illustrate this type of mode system:


(11) The mouse ran ...

a. ... across the floor.
b. ... from under the table.
c. ... into its hole.


4 The non-existence of a Place marker crucially sets this analysis apart from the others.


Whereas Goal and Source are overtly marked in (11b,c), by from and -to respectively,
there is no additional marking in (11a). The (relevant part of the) English spatial system
can thus be represented as follows:


(12) English spatial expressions: 


[mode {from, to(/-to)} [configuration {‘in’, ‘under’, etc.} ] ]


In this analysis, the absence of Goal and Source is taken to correspond to the absence
of the mode function in general.


3.2 Place as a generalized configuration 


If Place is a generalized configuration, it should not occur in combination with more
specific configurations, as these should be mutually exclusive: From a functional perspective,
it does not make much sense to standardly, that is, not as a restatement but as
the normal way of expression, mark something in general and at the same time express
it in more detail too (cf. ‘a vehicle car’, for an attempt to illustrate with a lexical example).
According to this analysis, Place always substitutes for more specific configurations.
The abstract semantic representation is repeated as (13) for convenience.


(13) Place as a generalized location: 


[mode {Source, Goal} [configuration {Place, ‘in’, ‘under’, etc.} ] ] 


Although Place in principle may be expressed covertly under this analysis, it could then 
also be argued to favor the type of analysis to be discussed next. Therefore, we will only 
consider overt instances of generalized configurations in this section. 

The locative suffix -(i)ng in Tswana (a Niger-Congo language spoken in South Africa) 
could be analyzed as a generalized configuration. Tswana has a subset of nouns used in 
spatial function without the addition of the locative case marker. Spatial configurations 
are specified by means of prepositions that are historically locational nouns (Denis 
Creissels, p.c.). These constructions, from which the locative case marker is lacking,
are used if the configuration needs to be expressed explicitly:

(14) Tswana (Creissels, p.c.)

     morago    ga     lebota
     behind    GEN    wall
     ‘behind the wall’


The locative suffix does not appear on top of such configuration markers but seems 
to be used in their stead, when a more specific expression is considered superfluous. 
Consider the following examples. 

(15) Tswana (Creissels, 2006a, 23)

     a. Monna    o       dule         motse-ng.
        man      S3:1    leave.PFT    3village-LOC
        ‘The man left the village’

     b. Monna    o       ile       noke-ng.
        man      S3:1    go.PFT    9river-LOC
        ‘The man went to the river’


The configurational interpretation of the locative suffix depends on the type of
ground (probably ‘in’ for villages and ‘at’ for rivers); mode is contributed by the motion
verb (Source in (15a) and Goal in (15b)). Note that not all verbs of movement are able to
contribute Goal or Source mode (cf. Reshöft and Lestrade 2013 for more elaborate discussion
of spatial-meaning dimensions expressed by motion verbs). Verbs that mostly
express manner of motion, such as taboga ‘run’, akofa ‘hurry’, fofa ‘fly’, and feta ‘pass’,
do not contribute mode:


(16) Tswana (Creissels, 2004, 11)

     Ke     tlaa    taboga     ko         tsele-ng
     S1S    FUT     run-FIN    DISTANT    9road-LOC
     ‘I will run on the road’

In sum, in Tswana the locative suffix -(i)ng seems to be used to generalize over 
specific configurations. If more specific configurations are expressed, it is not used. 
Also, it does not add any mode meaning whatsoever, a function that seems restricted
to motion verbs (or applicative markers, cf. Creissels 2004).


3.3 Place as the locative function 


To tell apart an analysis of Place as the locative function and the previous analysis,
its expression should occur between mode expressions and overt configuration expressions:


(17) Place as the locative function:


[mode {Source, Goal} [locative function Place [configuration {‘in’, ‘under’, etc.} ] ] ]


In spatial systems of this type, Source and Goal systematically have to be built on 
top of Place, which intervenes between mode and configuration expressions. Although 
in principle here too Place may be expressed by a zero marker, we will not consider this 
scenario as we then cannot distinguish the present from the previous analysis.


Consider the following examples from Malayalam: 


(18) Malayalam (Asher & Kumari, 1997)

     a. Avan    viiʈʈ-il     uɳʈə.
        he      house-LOC    be.PRES
        ‘He is at home.’ (p. 225)

     b. Ninnalkke    kitakkay-il    kitakkaam;    allenkil     paayayil    kitakkaam.
        you-DAT      bed-LOC        lie-PERMIS    otherwise    mat-LOC     lie.PERMIS
        ‘You can lie on the bed or you can lie on the mat.’ (p. 139)

     c. Addeham    innale       talayoolapparamp-ileekkə    pooyi.
        he.HON     yesterday    Thalyolaparambu-ALL         go.PAST
        ‘He went to Thalyolaparambu yesterday.’ (p. 182)

     d. Avan    viiʈʈ-il     ninnə    innale       vannu.
        he      house-LOC    from     yesterday    come.PAST
        ‘He came from home yesterday.’ (p. 226)


The locative case marker -il in the first two examples generalizes over whatever specific
configurations may hold in the real world between the locatum and the ground (‘in’ in
(18a) vs. ‘on’ in (18b)). Goal and Source expressions are added on top of this marker: The
allative Goal marker in (18c) can easily be decomposed into the locative marker plus
-eekkə, and the Source postposition ninnə is used in addition to the locative case in (18d).
Thus, the markers for Goal and Source are both added on top of the suffix -il, which does
not seem to express any specific configuration itself, but whose interpretation rather
seems dependent on the type of ground. So far then, the locative case in Malayalam
behaves similarly to that in Tswana, which was argued to be a generalized configuration.

Very differently from the situation in Tswana, however, the combination of configurational
expressions and Place seems very well possible in Malayalam, suggesting
that the analysis of Place as a generalized configuration may not be right. The locative
marker can be recognized in many configurational expressions, such as munpil ‘in
front of’ and pinnil ‘behind’ (although this is not always possible, cf. mite ‘above’, meel
‘on’), and also examples of the “complete” structure in (17), using mode, locative
function, and configuration together, are easily found, as illustrated for Source in (19):


(19) Avan    vaatilinre    pinn-il       ninnə    vannu.
     he      door-GEN      behind-LOC    from     come.PAST
     ‘He came from behind the door.’


These combinatory possibilities then suggest an analysis in terms of the locative 
function. Note however that if we analyze the locative marker in Malayalam in terms 
of the locative function, the linguistic specification of configuration has to be optional, 
as it would then be lacking from (18a-b). (Again, reduced complexity at one level causes
increased complexity at some other place.)


3.4 Place as a distinction of mode 


Finally, Place could be a full-fledged mode distinction. In this case, we expect it to be
mutually exclusive with Source and Goal, all three being expressed on top of configuration
expressions:

(20) [mode {Source, Goal, Place} [configuration {‘in’, ‘under’, etc.} ]]


A pattern that suggests this type of analysis can be observed in Northern Akhvakh. 
Creissels (2009, 5) shows that the spatial case paradigm of Northern Akhvakh can be 
decomposed into a configuration marker and a mode marker. As illustrated in Table 1, the spatial
paradigm consists of complex markers that combine a configurational and a mode morpheme.
For example, the Place morpheme -e/i is put on top of the configuration -1° ‘under’
to express ‘under’; if the Source marker -a(je) is added to this configuration instead, we 
get ‘from under’. 

Crucially, Northern Akhvakh has an independent Place marker on top of the con- 
figuration markers that is in complementary distribution with the other mode markers. 
We can observe similar patterns in the spatial case paradigms of for example Hungarian 
and Finnish. Since Place patterns with the other mode distinctions in these systems, one
could argue that it is a mode distinction too.


                                               Place       Source         Goal
default configuration                          -g-e        -g-a(je)       -g-u(ne)
‘in the vicinity of’                           -xar-i      -Lir-a(je)     -xar-u(ne)
a. ‘in a relatively narrow space’              -q-e        -q-a(je)       -q-u(ne)
b. ‘distributed or diffused localization’
‘under’                                        -T -i       -X -a(je)      -1 -u(ne)
a. ‘in a filled dense space’                   -L-i        -L-a(je)       -L-u(ne)
b. ‘on a non-horizontal surface’


Table 1: Northern Akhvakh spatial case paradigm


Slightly more complex evidence can be derived by considering the case forms of 
the spatial adpositions of these languages. Hungarian has ten spatial cases in total:
three mode options for each of three very general configuration distinctions (approximated
by ‘in’, ‘at’, and ‘on’; only the last of these was illustrated in Section 1), plus an
additional terminative case that does not combine with these three
configurations. In addition, Hungarian can make use of adpositions to express spatial 
meaning. The stems of these spatial postpositions express specific configuration dis- 
tinctions, whereas their case forms specify mode. This is illustrated in the following 
example (cf. also Creissels 2006b and Stolz 1992): 


(21) Hungarian (Hegedűs 2008, 221)

a. a ház mellett
   the house beside.PLACE
   ‘beside the house’

b. a ház mellé
   the house beside.GOAL
   ‘(to) beside the house’

c. a ház mellől
   the house beside.SOURCE
   ‘from beside the house’


As shown in (21), the adposition stem expresses configuration whereas its different case
forms distinguish between modes. Thus, instead of combining with all ten spatial cases
that are available in Hungarian, the case paradigm of Hungarian postpositions only
makes a three-way mode distinction.⁵ This reduced spatial case paradigm can easily
be explained from a functional perspective: Spatial adpositions in Hungarian make
a much more fine-grained distinction in configuration than spatial cases. The very
general configuration distinctions that are made by the nominal spatial case paradigm
are therefore redundant on adpositions and only a mode distinction is necessary (cf. also
the argumentation in Section 3.2). Importantly, Place is one of the mode distinctions
that are formally distinguished in the case paradigms of these adpositions, not one of
the configuration distinctions that are omitted. This again suggests that, in Hungarian,
Place belongs to the mode domain, taking configurations as its input.

⁵ We find a comparable situation in Finnish, discussed in Lestrade 2010. For a cross-linguistic overview of
the distribution of labor between cases and adpositions within complex spatial PPs, cf. Lestrade et al. 2011.


4 Discussion 


Above, we have seen evidence for different proposals for the analysis of Place. In this 
section an attempt is made to link these various systems in a diachronic sketch of the 
possible development of Place. 

It can be hypothesized that Place first emerges in a language as the result of a gram- 
maticalization process in which the most frequently used configurational expression 
grammaticalized to such an extent that it no longer inherently expressed any distinc- 
tion whatsoever (cf. a.o. Lehmann, 1985). Place, at this stage, has become a generalized 
configuration, its locative function and mode interpretation resulting from contextual 
enrichment. In the development of new configuration markers, necessary to commu- 
nicate specific configurational meaning, Place-as-a-generalized-configuration could be 
used to explicitly mark these markers for their new role. Thus, Place comes to express 
the locative function. Malayalam, discussed in Section 3.3, possibly could be said to 
illustrate this transition stage. In a next stage of grammaticalization, a language may 
develop a morphological mode system to provide a temporal specification of this Place- 
with-the-locative-function in combination with motion events. Languages may develop 
a Source marker that restricts the Locative function to a (time) interval before a change 
and/or a Goal marker that restricts it to an interval after a change. Since Source and 
Goal have the Locative function as their default input, their markers can either be used 
on top of the former locative marker (reflecting their semantic relation), or in contrast 
with it (as the default input of a function need not be expressed). 

Interestingly, the two case systems that emerge at this point in our sketch nicely
correspond to the syncretism patterns that are attested cross-linguistically. If only a
two-way mode distinction is made with a special Source marker, the former locative
marker will come to express non-Source mode, i.e. be compatible with Place and Goal.
If, on the other hand, a two-way mode distinction is made with a special Goal marker
only, the former locative marker expresses non-Goal mode, i.e. Place and Source. What
is not expected is the development of a mode function that only says that the locative
function should be linked to a motion event instead of a stative one. As explained in the
introduction, this is not very informative and therefore such a marker is unlikely to
develop. Indeed, virtually the only attested spatial syncretism patterns are between Place
and Source or Place and Goal (cf. Stolz 1992; Creissels 2009; Pantcheva 2010; Lestrade
2010; cf. Kutscher 2010 for a synchronic exception that can be explained away via
phonological attrition). If a second mode distinction is developed (Source, if Goal was
already there and vice versa), the Place-with-the-locative-function marker will first ex- 
press Place by pragmatic reasoning only: If the location is not restricted to a subinterval, 
it is interpreted as holding throughout the event. Eventually, however, Place-with-the- 
locative-function can be expected to end up expressing a mode distinction directly by 
semantic strengthening, that is, by not deriving the Place-as-a-mode interpretation in- 
directly, but by establishing the link in its lexical semantics. Thus, Place-as-a-mode 
could be considered to be the fossilized version of Place-with-the-locative function and 
should only emerge in mode systems in which the two other basic modes Goal and 
Source are developed first (cf. Wilkins and Hill 1995 for such a diachronic relation be- 
tween a “pragmatic” and a “semantic” phase; cf. Blutner 2007 for a similar use of the 
notion fossilization). 
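
The predictions of this diachronic sketch can be stated compactly. The following is a minimal illustration under a simplifying assumption that is ours, not a claim from the literature cited: the former locative marker is taken to cover whatever modes have no dedicated marker of their own. The function name is hypothetical.

```python
MODES = ("Place", "Source", "Goal")


def locative_coverage(dedicated_markers):
    """Modes still expressed by the former locative marker.

    `dedicated_markers` is the set of modes (Source and/or Goal) for which
    the language has developed a special marker. Place is never a dedicated
    marker at these stages; its reading arises by pragmatic contrast.
    """
    dedicated = set(dedicated_markers)
    if not dedicated <= {"Source", "Goal"}:
        raise ValueError("only Source and Goal can receive dedicated markers here")
    return tuple(mode for mode in MODES if mode not in dedicated)
```

On this sketch, `locative_coverage({"Source"})` yields Place and Goal, `locative_coverage({"Goal"})` yields Place and Source, and once both markers exist only Place is left. A marker covering Source and Goal to the exclusion of Place never arises, in line with the syncretism patterns reported above.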

The following example may illustrate this last stage of the development in progress. 
As shown in (22b) for Goal only, in Imonda the markers for Goal and Source are used on 
top of the Place marker, whose independent use is illustrated in (22a). However, as (22c) 
shows, sometimes it is possible to omit the latter and use the Goal marker directly on
the ground.


(22) Imonda (Seiler 1965) 


a. iéf-ia
   house-Loc
   ‘at the house’ (p. 71)

b. iéf-ia-m ka uagl-f.
   house-Loc-Goal I go-PRES
   ‘I am going home’ (p. 161)

c. Ném at uagl-n.
   bush-Goal COM go-PAST
   ‘He has gone to the bush’ (p. 161)
The optionality of the locative marker could be understood as the beginning of a process
in which the Place marker changes from the input of the modes Goal and Source into
a mode distinction proper: If (22a) and (22c) are contrasted, one could say that Place
and Goal are developing a complementary distribution, which may subsequently lead
to their equivalent status semantically.


5 Conclusion 


This paper discussed the status of Place markers in a cross-linguistic sample of spatial- 
case inventories. It was proposed that a uniform analysis of Place cannot be given but 
that languages may have very different spatial systems instead. In some, Place should be
considered a generalized location; in others, it can have a locative function explicitly
establishing the link between locatum and location; and in yet other languages, Place may
function as a full-fledged mode distinction contrasting with Goal and Source meanings
that are universally accepted as modes. Thus, in some languages the mode dimension is
obligatorily marked whereas in others this is only done when deemed necessary.

The different options were hypothesized to be diachronic variants rather than (onto)- 
logical opposites. Place may start out as the result of the interpretation of the locative 
function in a system of pragmatic contrasts with Source and Goal. From this, it can be 
expected to develop its own inherent mode semantics by pragmatic strengthening. 

Whether this grammaticalization hypothesis is right or wrong, our semantic representations
of spatial meaning should probably at least have the possibility of allowing
Place as a mode distinction to account for the variation described here.
Abstract


Aspectuality has been claimed to be determined by the same principles in both literal 
and idiomatic readings of equivalent structures. In this paper, we analyze the English 
V one’s BODY PART out/off idioms which correspond to a pattern of intensive meaning 
construction involving a change in the interpretation of the aspectual classes of their 
VPs. This class of idiomatic constructions systematically denotes a change of location
undergone by a body part in the source domain, which is metaphorically projected onto
the target domain, an event carried out in an intensive fashion. The activation
of metaphorical modes of thought is the foundation of the two-level integration
model advanced here as a semantic compositional representation (semantic pole) of the 
idiomatic constructions. The model, blended in nature, gives rise to emergent structures 
which are foregrounded with respect to the unitary integration process. The interac- 
tion between the cognitive operations involved in the construction of the final idiomatic 
meaning is argued to motivate the shifts toward atelicity of the idioms analyzed. 
Keywords: Lexical Aspect; Aspectual Shifts; Idioms; Cognitive Grammar; Fake Resul- 
tatives 


1 Introduction 


The main question to be addressed in this paper is whether the aspectual properties of 
idiomatic constructions can be determined according to the same principles we would 
use for non-idiomatic ones. We take up the issue by focusing on a specific pattern of
intensive meaning construction in English: the V one’s body part out/off idioms. In
particular, we provide an analysis of constructions of the type John laughed his head off
(‘John laughed intensely/a lot’) and she cried her eyes out (‘She cried a lot’), where the
intensity of the action is systematically conveyed by a caused removal of a body part
expressed in the linguistic structure.
The activation of this metaphorical mapping has consequences for the conceptual
interpretation of aspect, which appears to be constrained by high-level cognitive
operations. In fact, under the literal reading of a construction containing the same VP
(e.g. the audience laughed the actor off the stage), a different aspectual class would be
involved. In more detail, under the idiomatic reading, the (unreal) eventuality can be
associated with an atelic resultative construction (a fake resultative in terms of Jackendoff
1997), while under the literal reading the sentence can be defined as a telic resultative
construction. These aspectual shifts have been motivated by advancing metaphorical
modes of thought dynamically activated in the process of idiom comprehension (Mateu
& Espinal to appear, 2010 after Gibbs 1994, Lakoff 1993, Lakoff & Johnson 1999).

The formulation of the metaphor AN INTENSIVE ACTION IS A CHANGE OF LOCATION (Mateu &
Espinal 2010) will be the basis for the application of the so-called Force Change Schema
(Broccias 2003), used as the semantic pole for resultative constructions and adapted to
the data discussed in the present study to propose a possible compositional path for
their idiomatic meaning. The model, structured by two levels of successive conceptual
integration, will be advanced as a schematic representation for the meaning implica- 
tions involved in the idiomatic pattern. The general goal of this paper is to investi- 
gate the cognitive operations involved in the conceptual interpretation of the aspectual 
properties related to different classes of predicates and to account for the shifts toward 
atelicity which affect certain classes of idioms like the ones under examination. We 
begin by discussing the notion of lexical aspect and its relevance within the Cognitive 
Linguistics framework in subsection 2.1. 

In subsection 2.2, we provide an overview of previous accounts which have specifi- 
cally dealt with idioms and aspectuality. In particular, we will consider as valid metaphor- 
ically driven approaches to idiomatic interpretation (Espinal & Mateu 2010) as opposed 
to formal treatments of idioms (Jackendoff 1997, McGinnis 2005, Glasbey 2003) which 
see idiomatic meaning as a combination of the properties of their syntactic constituents. 
In section 3, (i) we advance our proposal by introducing the problem of aspectual shifts
and examining the cognitive operations involved in idiom comprehension and (ii) we
introduce the two-level integration model as a heuristic representation of their semantics.
We conclude with some final comments in section 4.
2 Background


2.1 The Inherent Structure of Events 


The first point that we need to clarify for a proper coverage of the topic is
the distinction between the notions of grammatical aspect and lexical aspect (or
Aktionsart). In the Cognitive Linguistics literature, scholars do not always endorse the
separation between the two types of aspect and its implications, which is
not surprising given the impossibility of drawing a clear-cut grammar/lexicon distinction
(Boogaart and Janssen 2007). However, when it comes to aspectual shifts, we assume
Vendler’s classification (Vendler 1967), and implicitly the relevance of lexical aspect, for
two main reasons.

First, we argue that there is a correlation between the inherent structure of events 
and the typical abilities for apprehending and tracking relationships claimed in Cog- 
nitive Grammar, namely the notion of scanning (Langacker 2008: 111). In fact, how 
component states of an event are accessed and conceptualized crucially relates to the 
binary properties assigned to the aspectual classes. Second, we endorse the defini- 
tion of aspect provided in Croft (2012) according to which lexical aspect describes how 
events are construed as unfolding over time and, thus, a two-dimensional analysis of 
aspectual types is required in order to investigate the semantic complexity of aspect 
and the conceptualization processes that intervene in the relationship between aspect 
and Aktionsart. Basically, two general approaches to aspect can be distinguished in 
the literature (Croft 2012, Michaelis 2004): unidimensional and bidimensional. In uni- 
dimensional approaches, there is no difference between the semantics of grammatical 
and lexical aspect. In bidimensional approaches the two types of aspect are seman- 
tically distinct. In the present account, we assume Croft’s (2012) construal approach 
according to which aspectuality has to be defined according to the semantic structure 
of predicates and inferred from the interpretations of predicates in different tense/aspect 
constructions. In other words, events may involve different perspectives, and then the 
possibility of viewpoint shifts in terms of aspectual construals is fundamental to capture 
the differences in the inherent structure of events. Since the analysis presented here is 
essentially focused on the lexical aspect of different classes of predicates, we assume 
as a starting point the basic Vendlerian classification into four different categories of
lexical aspect.


(1) States: be sick [stative, durative, atelic]
(2) Activities: sing, run [non-stative, durative, atelic]
(3) Achievements: sink [non-stative, punctual, telic]
(4) Accomplishments: build [non-stative, durative, telic]


Generally speaking, these classes are defined according to three binary distinctions: 
stative/non-stative, punctual/durative, telic/atelic. The present analysis is concerned 
with detelicization processes in idiomatic contexts, namely aspectual shifts from a telic 
to an atelic interpretation of a predicate when an idiomatic expression has the same 
syntactic structure, or at least the same verb phrase, as a non-idiomatic counterpart. 
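
The classification in (1)-(4) and the notion of detelicization can be restated as feature bundles. The sketch below is only an illustrative encoding of the three binary distinctions; the names `Aspect`, `classify` and `detelicize` are hypothetical and do not correspond to any established tool or to Vendler's own formalization.

```python
from typing import NamedTuple, Optional


class Aspect(NamedTuple):
    stative: bool   # stative vs. non-stative
    durative: bool  # durative vs. punctual
    telic: bool     # telic vs. atelic


# Feature values copied from (1)-(4).
VENDLER_CLASSES = {
    "state": Aspect(stative=True, durative=True, telic=False),
    "activity": Aspect(stative=False, durative=True, telic=False),
    "achievement": Aspect(stative=False, durative=False, telic=True),
    "accomplishment": Aspect(stative=False, durative=True, telic=True),
}


def classify(stative: bool, durative: bool, telic: bool) -> Optional[str]:
    """Return the Vendler class with exactly these features, if there is one."""
    target = Aspect(stative, durative, telic)
    for name, features in VENDLER_CLASSES.items():
        if features == target:
            return name
    return None


def detelicize(name: str) -> Optional[str]:
    """Shift a class from telic to atelic, keeping the other two features."""
    shifted = VENDLER_CLASSES[name]._replace(telic=False)
    return classify(*shifted)
```

Under this encoding, `detelicize("accomplishment")` returns `"activity"`, which is the shift at stake in the idioms analyzed here: the literal resultative behaves as an accomplishment, its idiomatic counterpart as a durative activity.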

In particular, states describe situations that are both stative and durative since they 
do not change and last over time. Activities describe both dynamic events and processes 
and involve a change over time. Additionally they do not have an inherent endpoint. 
Processes are also instantiated by the Achievement class but provide as well a culmi- 
nation of the event in a punctual point in time. Accomplishments involve a process 
resulting in a change of state that lasts in time. The typical diagnostic procedure to 
define the aspectual class of a verb is the modification by the container and durative ad- 
verbials (Croft 2012). The in-phrase and for-phrase modification (as originally dubbed 
in Vendler 1967), commonly used to distinguish between telic and atelic events, indicate 
respectively the length and the span of time over which the event occurred. 

These diagnostics will provide the analysis with crucial insights to define the as- 
pectual properties of the data discussed in the present paper. Other methodologies 
have been applied to define more specifically the properties of the four categories, even 
though their semantics may overlap and, accordingly, the predicates may belong to different
aspectual classes. This comes as no surprise given the fact that each category
shares at least one property with the other three categories in the taxonomy. We now
describe how this potential overlap has been diagnostically disentangled.
The present progressive what are you doing? test has been applied with respect
to the stative/non-stative distinction, and in particular to differentiate states (to know)
from activities (to laugh), since both are durative and atelic but diverge in
terms of the dynamicity of the event.


(5) What are you doing? *I am knowing. 
(6) What are you doing? I am laughing. 


Finally, two other tests are used to distinguish, on the one hand, accomplishments
from the other three categories and, on the other hand, states from the rest
of the taxonomy: the it took me/him/her/us TIME INTERVAL to test and the do you STATE?
test.
(7) It took them two months to build the castle.
(8) Do you know the truth? Yes, I do.


Vendler (1967) posits other diagnostic questions to distinguish achievements from states. 
The at what moment?-test and the for how long?-test are used to point out the compati- 
bility of achievements with the first temporal question while states are fine if modified 
by the second one. Inverting the test to evaluate the nature of the predicates for the two
classes will lead to semantic inappropriateness, or more drastically to ungrammaticality.


(9) At what moment did the ship sink?/*At what moment have you been sick?
(10) For how long have you been sick?/*For how long did the ship sink? 


However, even if helpful, the above-mentioned tests do not completely settle the
attribution of aspectual properties to the individual classes, since this operation is
crucially influenced by usage-based and viewpoint factors, as well as by the
morphological/inflectional elements that, in some languages, play a role in the definition of
aspect (Dahl 1985).


2.2 A Conceptual Metaphor Account of Aspectuality 


The model presented in this paper to account for the cognitive operations that intervene 
in the conceptual interpretation of aspect and constrain the attribution of the aspectual 
class to the VP in idiomatic context, is based on a previous analysis advanced in Espinal 
& Mateu (2010) and Mateu & Espinal (to appear) which has posited the activation of 
metaphorical modes of thought as the fundamental motivation for the atelicity of idioms 
like (11) and (12). 


(11) John worked his guts out all day long/*in ten minutes. 
(12) John laughed his butt off all day long/*in ten minutes. 
(Mateu and Espinal to appear) 


In particular, the above sentences, which appear to fall in the class of fake resultatives,
are compared to the telic resultative constructions in (13) and (14), which are associated
with literal interpretations.


(13) The audience laughed the actor off the stage in/*for ten seconds. 
(14) She worked the splinter out of her finger in/*for ten seconds. 
(Mateu and Espinal to appear) 


By claiming the activation of conceptual metaphors, the study demonstrates how the
idiomatic readings in (11) and (12) can be associated with durative activities (given also
the possibility of modifying the sentence by a for-phrase) and goes beyond Jackendoff’s claim
that VPs in fake resultatives are interpreted as “V excessively” and Glasbey’s (2003)
argument according to which in the non-literal sentences there is no gradual patient
relationship. The intuition to deal with fake resultatives in terms of conceptual metaphor
is inspired by Goldberg’s (1995) account of true resultatives, which in her Construction
Grammar approach are seen as a metaphorical extension of the caused-motion
constructions of the type John kicked the bottle into the yard. By means of the basic
conceptual metaphor CHANGE OF STATE IS A CHANGE OF LOCATION, the resultative construction
structure is ‘inherited’ from the caused-motion construction. Different formulations of the
specific conceptual metaphors involved in the interpretation of the idioms in (11) and (12) are
provided in Espinal & Mateu (2010). First, the conceptual mappings involve the primary
metaphor THE BODY AS A CONTAINER, since a figurative extraction of a body part from
the container occurs in the source domain and is mapped onto the target domain, that
is, the more abstract intense action. In their terms, the action carried out in an excessive
fashion is expressed in the linguistic structure by a displacement of a body part.


(15) AN INTENSE ACTIVITY IS AN EXCESSIVE DETACHMENT (OR EXHAUSTION) OF A BODY PART

The metaphor as formulated in (15) is a subset of the more general (complex) conceptual
metaphor in (16) which is responsible for the interpretation of idioms like (11) and (12)
as durative activities.

(16) AN INTENSE ACTIVITY IS AN EXCESSIVE CAUSED CHANGE OF LOCATION/STATE


In particular, the change of location denoted by the directional paths (out or off) is pro- 
jected into the domain of the activity, characterized as ‘so intense that they appear to 
lack boundaries’ (Mateu & Espinal to appear). We acknowledge the role of the conceptual
metaphor in the definition of aspect in idiomatic contexts but at the same time we
claim that it is insufficient to account exhaustively for the cognitive modes of thought
involved in meaning construction which constrain the final atelic interpretation of the
idiomatic constructions.


3 A Conceptual Analysis of Aspectual Shifts 


In the present study, an aspectual shift is claimed to occur (in certain classes of idioms)
when a VP that allows both a literal and an idiomatic reading can be associated with


152 


Force Change Schemas and Excessive Actions 


different aspectual classes depending on the interpretation that is accessed according
to contextual information and communicative purposes. Several classes of idioms have
been argued to be affected by aspectual shifts toward atelicity. The V one’s BODY PART
idioms, examined in the present paper after Espinal & Mateu (2010), are one of those
classes. Furthermore, relevant counterparts, undergoing the same types of shifts
and involving the same patterns of conceptual interaction, have been proposed for Romance
languages (e.g. Italian, see Bellavia 2012). Let us analyze the following
minimal pair:


(17) The audience laughed the actor off the stage in ten seconds/*for ten seconds.


(18) John laughed his head off for ten seconds/*in ten seconds. 


The verb to laugh under the literal and the idiomatic readings is associated to two 
different aspectual classes, respectively. In (17), the possibility to modify the event 
by using an in-phrase adverbial allows us to define it as telic (accomplishment). The 
same cannot be said for (18), where the VP under the idiomatic interpretation denotes 
a durative activity. The problem at issue is complex and relates to different factors. 
First of all, the question we should find an answer to is how the aspectual properties 
of the same VP can be different in the two relevant readings. Then, we should find out 
whether it is a problem that can be explained by looking at the structural components 
of the sentence or we need to appeal to the conceptual interpretation of aspectuality. 

We claim that the change in the aspectual properties can be accounted for by con- 
sidering the cognitive operations involved in the conceptual mapping between two do- 
mains of experience, namely the concrete change of location expressed in the struc- 
tural components of meaning and the intensity of the action expressed by the idiomatic 
meaning. These semantic implications are heuristically represented using a two-level 
model of conceptual integration where, at the first level, the integration will involve two 
components of meaning giving rise to the single sentence unit of the idiom like in John 
laughed his head off; at the second level, the integration will affect the two domains of 
experience implicated via metaphorical activation. The semantic model
is described in more detail below.

Following the main tenets of Cognitive Grammar (Langacker 1987, 1991), we argue 
that idiomatic constructions involve at the semantic pole a complex scene that consists 
of a final foregrounded meaning as a result of a compositional path which corresponds
to the process of assembling their semantic structure. The purpose of the compositional
path is to capture in a unitary fashion all the meaning implications, patterns of
figuration (Langlotz 2006) and cognitive operations involved in idiomatic interpretation.
The phonological pole implies the same configuration as the one corresponding to
a potential literal scene implied by the sentence. In this sense, the literal scene “works
as the scaffolding against which the idiomatic meaning is conceived” (Langlotz 2006:
108). Once the idiomatic meaning can be accessed via patterns of figuration which provide
a conceptual basis to make sense of its semantics, it will be foregrounded. In the
background, the literal scene will still be available, but as a more concrete domain from
which the conceptual structure is imported, or, to put it in terms of Langlotz (2006),
as a standard of comparison for the foregrounded idiomatic meaning.

We argue that the meaning implications involved in the idiomatic construction in
(9b) carry aspectual information: since the displacement of the body part is unreal
and is used as a source domain to make sense of the intensity domain, there is no 
endpoint involved in the idiomatic event. But the inherent scene provided by these 
idioms is much more complex and to represent it properly we resort to the Force Change 
Schema (FCS) as developed in Broccias (2003). The FCS will serve as the conceptual 
“scaffolding” to build up the two-level integrated model implied by the activation of the 
conceptual metaphor AN INTENSE ACTION IS A CHANGE OF LOCATION, which will give rise to
the foregrounded idiomatic meaning.

To sum up: the sentence in (17), associated with a literal reading, can be claimed
to be a true resultative. We have already seen that examples such as (18) have been
defined as fake resultatives since they are conceptually associated with atelic readings and
there is no semantic relation between the V and the NP. More precisely, there is no 
semantic constraint of patienthood over the NP (Goldberg 1995: 99-100). 

The FCS has been proposed to represent the semantic pole of transitive resultative
constructions (Broccias 2003: 52) as in the following examples:


(19) John hammered the metal flat. 
(20) Sally danced herself to fame. 


Interestingly enough, a crucial distinction between (19) and (20) is pointed out in Broccias
(2003: 178). The former conveys a visible condition, the latter a non-visible condition.
When a non-visible condition is involved, the event is said to be carried out in
an above-the-norm fashion.

The FCS is a composite structure which results from the integration (in terms of
Fauconnier & Turner 1996) of a force component (FC) and a change component (CC). In
a sentence like (17), the FC is the audience laughed the actor, whereas the CC is the actor
off the stage. The V is an intransitive verb that is constructed here in a forcible fashion
and, in terms of Langacker (2009: 256), can be considered as the skewing element of
the construction, namely an element whose meaning is incongruent with the composite
meaning of the expression it appears in. The schema in Figure
1 represents the FCS and is related to the true resultative construction of the literal
reading in (17). At the FC, the trajector the audience exerts the force instantiated by the
verb laughed over the landmark the actor. At the CC, the force causes the displacement
of the element that corresponds to the landmark from an origin to a goal. The path
off is instantiated by an arrow. The entities that are not in bold are not specified in
the linguistic structure. In this sense, even if off the stage could be considered as the
resultant state, no specific entity representing the goal is expressed in the sentence. The
dotted lines indicate the correspondences between the entities of the two components
that are integrated in the single conceptual unit (the blend).


[Figure 1, diagram: a force component (The audience laughed the actor; labels: trajector, FORCE) integrated with a change component (the actor off the stage).]


Figure 1: The audience laughed the actor off the stage 
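
The integration of the two components can also be sketched as a data structure. This is an informal illustration of the description above, not Broccias' (2003) formalism: the class names, field names and the `integrate` function are assumptions of ours, and the correspondence marked by the dotted lines is reduced to a single identity check between the landmark of the FC and the displaced entity of the CC.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ForceComponent:
    trajector: str  # entity exerting the force
    force: str      # verb instantiating the force
    landmark: str   # entity the force is exerted on


@dataclass(frozen=True)
class ChangeComponent:
    theme: str                    # displaced entity
    path: str                     # directional path
    origin: Optional[str] = None  # left unspecified in this sketch
    goal: Optional[str] = None    # not expressed in the sentence


def integrate(fc: ForceComponent, cc: ChangeComponent) -> dict:
    """Blend FC and CC into one unit if the landmark corresponds to the theme."""
    if fc.landmark != cc.theme:
        raise ValueError("landmark of the FC must correspond to the theme of the CC")
    return {
        "trajector": fc.trajector,
        "force": fc.force,
        "affected": fc.landmark,
        "path": cc.path,
        "origin": cc.origin,
        "goal": cc.goal,
    }
```

For (17), `integrate(ForceComponent("the audience", "laughed", "the actor"), ChangeComponent("the actor", "off the stage"))` gives a single unit in which the actor is both the landmark of the force and the entity that changes location, while the goal remains unspecified.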


The point we make in the present paper is based on an extended version of the 
FCS consisting of two levels of integration obtained via metaphorical activation. The 
two-level model provides a schematic description of the semantic pole of the idiomatic 
construction in (18) and is representative of fake resultatives. As represented in Figure 
2, at the first level (exactly as in the literal reading) the integration between the FC and
the CC results in a single conceptual unit. Thus, we have a force exertion of the verb
to laugh from the trajector John over the landmark head at the FC, and a displacement
(head off) from an origin toward a goal at the CC. Given the coreferentiality of the
possessive determiner with the subject, the origin coincides with the trajector. We claim
that the first-level integration occurs within the source domain, that is, the change of
location.

The interaction of this domain with the target domain INTENSITY, conceptualized via
the image-schematic structure SCALE, gives rise to the final level of integration, where
the event of laughing itself is argued to assume the role of trajector moving along
the open-ended scale of intensity, thus providing no inherent endpoint in the
event. In fact, as defined in Johnson (1987: 123), the image schema SCALE may either
continue indefinitely in one direction or may terminate at a definite point. The concept
of intensity has been defined in the literature as open-ended, hence we stipulate the
indefinite value of the abstract concept (∞) expressed by the intense action. Again, the
dotted lines indicate the correspondences between the entities of the two components
that are integrated into a single conceptual unit.


[Diagram: labels trajector, landmark, FORCE, force component, change component, INTENSITY
and CHANGE OF LOCATION, aligned with the sentence "John laughed his head off"]


Figure 2: John laughed his head off

The single conceptual unit of the second-level integration will be the salient part
corresponding to the foregrounded idiomatic meaning. Blended spaces are the result
of projecting source domains onto target domains. Furthermore, conceptual units that
are the result of blending operations are hybrid (Langacker 2008: 51) in the sense that
they combine and foreground selected features of each input space. In the same way, at
the end of idiom comprehension, the speaker will select the intense activity because the
final level of integration will be in the foreground.


4 Final Comments 


The proposal advanced as an account of aspectual shifts has focused on the
cognitive operations involved in idiomatic meaning construction and its processing.
Our main concern has been to explain the systematicity of the expression of intensive
actions via a caused removal of a body part. In this respect, we have proposed a two-level
integration model as a representation of the unitary compositional paths entailed by the
semantics of the V one’s body part out/off idioms.

The model is based on the Force Change Schema (Broccias 2003), which consists of a
single conceptual unit resulting from the integration of a force component and a
change component. It implies a second level of integration given by the activation of the
conceptual metaphor an intense action is a change of location, first proposed in Espinal
and Mateu (2010). The atelicity of the events has been assumed to be caused by the
unbounded nature of the concept of intensity involved in the target domain. We have
also argued that the conceptual mappings allow the different experiential domains to
be integrated in an emergent structure that, given its complex blended nature, results in
a foregrounded space, namely the final level of idiom processing.


Abstract

The aim of the present studies is to assess the hypothesis that knowledge is sensory in
nature through a series of experiments. In particular, we investigated the links between
memory and perception using a short-term priming paradigm based on a prior learning
phase in which a geometrical shape was associated with a white noise.
The priming phase then examined the effect of a geometrical shape, seen in
the learning phase, on the processing of a target (tone or picture). Our main results
demonstrate that memory and perception share some mechanisms and at least some
components, which are involved in the processing of each form of knowledge (i.e.,
episodic and semantic). Finally, we discuss the implications of this work for the study of
perceptual learning and memory.

Keywords: Perception, Integration, Multisensory Memory 


1 Introduction 


How do people represent information in memory? What is the nature of the information
stored in memory? We can consider that learning representations or concepts depends
upon perceptual experiences. In that view, understanding the relation
between memory (i.e., concepts) and perception (i.e., percepts) is critical. Classically,
perception and memory are described vertically. In this description, perception extracts
perceptual units from the environment through bottom-up processes. These units are then
converted into representations and stored in memory. In return, the activation of
these representations can influence perception through top-down processes. In
that conception, the differences between memory and perception are both structural
and functional (e.g., Humphreys & Riddoch, 1987). Regarding the structural distinction,
however, recent neuroimaging studies suggest that memory and perception share common
brain areas (for a review, see Versace, Labeye, Badard & Rose, 2009). For instance, Martin
and collaborators (2000) showed that conceptual processes (i.e., word-object naming)
and perceptual processes (i.e., picture-object naming) involve the same brain area,
depending on the perceptual (i.e., color) and motor properties of the objects. Regarding
the functional distinction, recent neuroimaging research also suggests that the
neural structures of long-term memory are involved during the perception of objects
or events (see Murray & Bussey, 2007). In particular, the medial temporal lobe cortex
ensures the integration of the different components of objects by means of a hierarchical
integration mechanism. Recently, Shimamura and Wickens (2009) have provided
evidence in support of the idea that memory activities (e.g., single item recognition)
might be underpinned by this integration mechanism.

In this paper, we aim to develop a conception in which perception and memory
are at the same functional level in the cognitive architecture. In other words, we want to
provide experimental evidence that perception and memory act simultaneously on the
same processing units. The only difference is that perception involves perceptually
present units whereas memory involves the reactivation or simulation of these units. To
this end, we have to provide evidence that 1) memory is able to keep traces
from perceptual events; 2) memory and perception use the same processing units.


2 Perception leaves memory traces


In daily life, the organism mostly processes multisensory signals. The unified perception
of a multisensory environment requires not only multiple activations in the
sensory areas but also the synchronization and integration of these activations (e.g.,
King, 2005). The existence of multisensory integration is particularly well illustrated by
the McGurk effect (McGurk & MacDonald, 1976). This effect reveals that subjects tend
to perceive /da/ when they see the syllable /ga/ and hear the sound /ba/. This demonstrates
the ability of one sensory system to modify the processing of another sensory
system. Integration could be described as the capacity of the perceptual system to process
a multisensory stimulus more efficiently (or differently, in the case of the McGurk effect)
than the sum of its parts. A number of neuroscience studies have been dedicated to the
study of multisensory integration between vision and audition. For example, King
and Calvert (2001) have shown that some neurons in the superior colliculus are more
highly activated by multisensory than by unisensory stimuli. Similarly, electrophysiological
studies have provided some evidence of audiovisual integration (between a
shape and a tone) occurring in the visual cortex after a period of just 40 ms (Giard &
Peronnet, 1999). In the same vein, authors have shown that spatial congruity enhances
audio-visual integration (Teder-Sälejärvi, Di Russo, McDonald, & Hillyard, 2001). Finally,
the role of attention during the perception of a multisensory event and its subsequent
integration is not well established (see Fort & Giard, 2002).

If a visual stimulus and an auditory stimulus tend to be integrated during a perceptual
activity (e.g., perceptual categorization or discrimination), is it possible that
memory could capture this integration? Once perceived, the perceptual properties of
a multisensory object can be preserved in memory in the form of a memory trace.
This is due to an integration mechanism that allows for the creation of durable links
between perceptual properties within the same memory representation (see Brunel,
Labeye, Lesourd & Versace, 2009; Hommel, 1998; Labeye, Oker, Badard, & Versace,
2008). Contrary to simple associative learning (see Hall, 1991), once features are integrated
within an exemplar, it is difficult to access the individual features (see Labeye
et al., 2008; Richter & Zwaan, 2010). This new unit, once acquired, becomes a functional
“building block” for subsequent processing and learning (in language, Richter &
Zwaan, 2010; in memory, Labeye et al., 2008; or attention, Delvenne, Cleeremans, &
Laloyaux, 2009). In this view, the integration mechanism is a fundamental mechanism
of perceptual learning (see the unitization mechanism, Goldstone, 2000) or contingency
learning (see Schmidt & De Houwer, 2012; Schmidt, De Houwer, & Besner, 2010). From
this idea we can predict that once two features have become integrated,
the presence of one feature automatically suggests the presence of the other. Thus, if
the simultaneous presentation of auditory information (a sound) and visual
information (a shape) leads to the creation of a multisensory memory trace, then
the visual component presented alone, as a prime, should influence
the perception of a sound target. We examined this prediction through an original
paradigm divided into two phases. The first was a learning phase (consisting of a shape
categorization task) in which we manipulated the association between a given geometrical
shape and a white noise¹. Participants simply had to categorize a
shape as a square or a circle (each shape was presented in different shades of gray).
It is important to stress that each shape was presented for 500 ms. One of these
shapes was systematically associated with a white noise (presented simultaneously for
500 ms), the other was not. The second was a priming phase (see Figure 1) in which participants
saw the geometrical shapes from the learning phase (as primes) and listened to pure tones
(as targets). In this phase, participants had to categorize the target as high-pitched
or low-pitched. Our first result was a selective priming effect of the geometrical shape
seen with a sound in the learning phase on the processing of target tones.


1 A white noise is a random signal with a flat power spectral density. White noise is considered analogous to
white light, which contains all frequencies.


Figure 1: Organization of the priming phase. A prime shape (seen in the learning phase), presented at one of two
SOAs (100 ms or 500 ms), is immediately followed by a target tone that participants had to categorize
as low- or high-pitched. Notes. SOA: stimulus onset asynchrony; ISI: inter-stimulus interval.


This priming effect could be interpreted as evidence of multisensory memory integration
during perceptual learning. Indeed, when participants saw a shape that was
previously presented with a sound, it automatically reactivated the associated auditory
memory component (see also Meyer, Baumann, Marchina & Jancke, 2007), which is
able to influence the processing of target tones. However, this result alone
gave us no hint about the nature of the auditory memory component. Indeed,
if memory and perception share the same processing units, then each component of
the memory trace should be perceptual in nature even when it is reactivated. In
order to test this assumption, we manipulated the SOA during the priming phase. More
specifically, we predicted that reactivation of the sound should interfere with tone target
processing only if the SOA between the visual prime and the tone target is shorter
than the duration of the sound associated with the shape during the learning phase.
In this case, the interference effect would follow from the temporal overlap between
the reactivation of the previously associated sound and tone processing. A second and quite
opposite prediction followed from different temporal constraints. Indeed, reactivation of
the sound (by the visual prime) was expected to facilitate tone processing, but only for
SOAs equal to or longer than the duration of the sound associated with the shape during
the learning phase. In this latter case, no temporal overlap occurs between the simulation
of the learned associated sound and target-tone processing, so that target-tone
processing should benefit from the auditory preactivation induced by the prime.
Our results (see Figure 2) were fully in line with these predictions.


Figure 2: SOA × Prime type interaction, F(1, 30) = 14.64, p < .001. (a) For the 100 ms SOA, significant main effect
of Prime type, F(1, 15) = 5.25, p < .05. (b) For the 500 ms SOA, significant main effect of Prime type,
F(1, 15) = 9.78, p < .01. Results reproduced from Experiment 1 of Brunel, Labeye et al. (2009). Notes. Sd
Prime: prime shapes that were presented with sound during the learning phase; NSd Prime: prime shapes
that were presented without sound. Error bars represent standard errors.
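The two SOA predictions tested in this experiment can be summarized as a simple decision rule. The sketch below is only an illustration of that logic, not the authors' analysis code: the function name, its parameters and the string labels are hypothetical, and the 500 ms default is the duration of the white noise reported for the learning phase.

```python
def predicted_priming_effect(soa_ms, learned_sound_ms=500, prime_had_sound=True):
    """Return the predicted effect of a visual prime on target processing.

    Hypothetical helper illustrating the temporal-overlap prediction:
    the reactivated sound is assumed to last as long as the learned sound.
    """
    if not prime_had_sound:
        # Nothing auditory was integrated with this shape, so nothing is reactivated.
        return "none"
    if soa_ms < learned_sound_ms:
        # The reactivated sound still overlaps with the target: interference.
        return "interference"
    # The reactivation has ended before target onset: only preactivation remains.
    return "facilitation"


for soa in (100, 500):
    print(soa, predicted_priming_effect(soa))
# prints:
# 100 interference
# 500 facilitation
```

Under these assumptions the rule reproduces the qualitative pattern described for the 100 ms and 500 ms SOAs, and predicts no effect for a prime shape learned without sound.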


We demonstrated that memory keeps traces of perception thanks to an integration
mechanism shared by perception and memory. As a consequence, the presentation of
one component of a memory trace is able to reactivate the other components (which
keep all of their encoded characteristics). Once reactivated, a component is able to
influence ongoing processing (see also Riou, Lesourd, Brunel & Versace, 2011). However,
according to Nyberg et al. (2000), this kind of effect is limited to the processing
of episodic knowledge and should not be observed when conceptual knowledge is at
stake. Indeed, only episodic knowledge should keep some perceptual properties of former
perceptual events. Such a claim suggests the existence of modal and amodal forms of
knowledge. The next section is dedicated to this specific issue.


3 The Sensory Nature of Knowledge


What is the nature of our knowledge? Answering that question is not easy, and it
suggests at least two different perspectives. First, we could consider that each form of
knowledge is qualitatively different and, as a consequence, that the forms differ in nature (i.e.,
modal vs. amodal). According to Tulving (1995), our knowledge could be viewed as
semantic or episodic. These two sorts of knowledge depend on the existence of two
independent memory systems. The semantic memory system is more likely to be involved
in the processing of general amodal knowledge whereas the episodic memory
system is involved in the processing of specific modal knowledge. Second, whereas Tulving
argued that these two kinds of memory are dissociated and differ in the abstractness
of the information they retain, an increasing number of studies have demonstrated the
existence of conceptual representations that nevertheless continue to possess a perceptual
nature (Barsalou, 2005; Barsalou, 2008; Barsalou, Simmons, Barbey, & Wilson,
2003). Indeed, there is experimental evidence showing that the reactivation of perceptual
or body states facilitates later conceptual processing for those concepts that share
the same perceptual characteristics as the reactivated ones (see Pecher et al., 2004; Van
Dantzig et al., 2008). In that view, memory processes are deeply rooted in perceptual
and action systems (see Barsalou, 2008) and, as a consequence, access to all forms of
knowledge is linked with the automatic reactivation of perceptual or body states. In that
context, we can predict that conceptual processing involves an automatic reactivation that
is not limited to a given sensory memory component but should be observed for each
diagnostic sensory component associated with a particular concept.

In order to test that prediction, we designed an experiment based on the same
paradigm as described in the previous section. The learning phase again consisted
of learning an incidental association between a geometrical shape and a white noise.
The second phase consisted of a short-term priming paradigm (see Figure 3) in which
a shape, either associated or not with a sound in the first phase, preceded an object
picture. The participants had to categorize this picture as representing either a large
or a small object (more or less than 50 cm high). We manipulated the SOA as well
as the nature of the object so that half of the objects were typically “noisy” objects
(e.g., a blender) whereas the others were typically silent (e.g., a screwdriver). In order
to perform the task, participants had to recognize the object and reactivate the actual
size of the object. However, if this reactivation is not limited to the visual component
and can spread to other diagnostic components (here auditory), we should observe the
same pattern of priming effects as described in the previous section, but limited to the
typically “noisy” targets.


Figure 3: Organization of the priming phase. A prime shape (seen in the learning phase), presented at one of two
SOAs (100 ms or 500 ms), is immediately followed by a target picture that participants had to categorize as a
small or large object.


As depicted in Figure 4, we found a priming effect due to the reactivation of an auditory
memory component by the sound-associated visual prime (i.e., the shape seen with sound
during the learning phase), limited to the “noisy” targets. As expected, this effect
was modulated by the SOA. Indeed, we found an interference effect with an SOA of
100 ms (Panel A) and a facilitation effect with an SOA of 500 ms (Panel B).


Figure 4: Panel A: Prime type × Target type interaction, F(1, 15) = 10.6, p < .01. (a) For Noisy targets, significant
main effect of Prime type, F(1, 15) = 10.6, p < .01. (b) For Silent targets, F < 1. Panel B: Prime type ×
Target type interaction, Fs(1, 15) = 6.24, p < .05. (a) For Noisy targets, significant main effect of
Prime type, Fs(1, 15) = 6.24, p < .05. (b) For Silent targets, F < 1. Results reproduced from Experiment
1 of Brunel et al. (2010). Notes. Sd Prime: prime shapes that were presented with sound during the learning
phase; NSd Prime: prime shapes that were presented without sound. Error bars represent standard errors.


We interpreted these results as evidence that the component reactivated by both the
prime and the target is of the same nature (i.e., perceptual). Consequently, our results
provide a strong argument in favor of the idea that access to conceptual knowledge
is linked to the reactivation of the component dimensions integrated within a concept
(see Barsalou, 2008; Vallet, Brunel & Versace, 2010), which is consistent with a grounded
view of cognition. In that case, we can consider that an opposition between modal
and amodal forms of knowledge is not appropriate for understanding phenomenological
distinctions between forms of knowledge. This issue will be discussed in the next
section.


4 Discussion 


The aim of this paper was to provide experimental evidence in favor of a horizontal
view of the relation between memory and perception. In that view, perception
and memory act simultaneously on the same processing units, which are perceptual
in nature. Indeed, our studies clearly show that the activation of an auditory memory
component (a component that is not perceptually present) is able to influence the
sensory processing of a sound or the conceptual processing of a typically “sound” concept
presented later. We therefore have to consider that knowledge in memory is necessarily
sensory-based, which is fully consistent with a grounded view of cognition
(see Barsalou, 2008). So far we can say that: 1) memory keeps episodic traces of
perceptual events; 2) memory traces integrate perceptual components; 3) the components
of a given memory trace keep their perceptual characteristics; 4) once a component
is activated, this activation is able to spread to the others and influence ongoing
processing irrespective of the cognitive activity.

However, there remain issues that we have not fully addressed in this paper. The
first concerns the type of processing units (i.e., exemplars vs. features). Indeed, in
the experiments reported here, participants implicitly learned, through a simple
categorization task, that a given shape, which varied along a separable dimension
(i.e., brightness), was systematically presented with a sound and the other was not. We
interpreted the fact that only the visual prime shapes that were presented with sound in the
categorization task (whatever the shape’s brightness) influenced the target’s processing
(sound or picture of typical sound concepts) in terms of an “exemplar-based” memory
view (Nosofsky, 1991; Logan, 2002). Each exemplar that was associated with sound
reactivated its previously encoded sound component. However, we can also interpret
our results as evidence of unitization (Goldstone, 1998) between a psychological feature,
namely a geometrical shape (i.e., squares or circles), and an auditory feature (a
white noise). According to the unitization mechanism, the perfect
co-occurrence of an auditory feature and a visual psychological feature leads to the creation
of a new functional feature combining these two features (Schyns, Goldstone &
Thibaut, 1998).

In recent work (Brunel, Vallet, Riou, & Versace, 2009; see also Brunel, Goldstone,
Vallet, Riou & Versace, 2013), we tried to settle experimentally between these conceptions
of memory storage². Basically, we used the same experimental design (learning
phase followed by a priming phase with target tones) as in the experiment of Brunel, Labeye
and collaborators (2009). However, we manipulated two imperfect rules of category learning,
namely the frequency of the sound-shape association (High vs. Low), in the learning phase
(see Figure 5).


[Stimulus grid: columns Non-isolated and Isolated; rows High Frequency and Low Frequency]


Figure 5: Stimuli used in the shape categorization task (learning phase) of Brunel, Vallet, Riou & Versace (2009). In this
example, for the high frequency condition, three squares (“non-isolated”) were presented simultaneously
with a white noise, whereas one (“isolated”) was presented without sound. Following the same example,
in the low frequency condition, one circle (“isolated”) was presented simultaneously with a white noise
whereas the other three were presented alone (“non-isolated”). All the experimental conditions were
counterbalanced between subjects.
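The example design in the Figure 5 caption can be written down as a small data structure to make the two frequency conditions explicit. This is a minimal sketch under stated assumptions: the tuple encoding, the brightness indices 1 to 4, the choice of which index is the isolated exemplar, and both function names are hypothetical and only mirror the caption's example (squares in the high frequency condition, circles in the low frequency condition).

```python
from fractions import Fraction

# (shape, brightness_level, presented_with_sound): hypothetical encoding of the
# example in the Figure 5 caption; which brightness level is isolated is arbitrary.
LEARNING_SET = [
    ("square", 1, True), ("square", 2, True), ("square", 3, True), ("square", 4, False),
    ("circle", 1, False), ("circle", 2, False), ("circle", 3, False), ("circle", 4, True),
]


def sound_frequency(shape):
    """Proportion of exemplars of this shape that were presented with the white noise."""
    flags = [with_sound for s, _, with_sound in LEARNING_SET if s == shape]
    return Fraction(sum(flags), len(flags))


def isolated_exemplars(shape):
    """Brightness levels whose sound status differs from the majority for this shape."""
    majority_with_sound = sound_frequency(shape) > Fraction(1, 2)
    return [level for s, level, with_sound in LEARNING_SET
            if s == shape and with_sound != majority_with_sound]


print(sound_frequency("square"), isolated_exemplars("square"))  # 3/4 [4]
print(sound_frequency("circle"), isolated_exemplars("circle"))  # 1/4 [4]
```

In this encoding, the isolated square is the exemplar learned without sound and the isolated circle is the one learned with sound, which are the two exemplars whose priming behavior is contrasted in the results.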


For the exemplars seen in the High Frequency condition of learning, we observed a generalization
effect in the priming phase. The isolated exemplar (which was presented
without sound during the learning phase) yielded the same priming effect as the exemplars seen
with sound in the learning phase. This generalization effect could be interpreted
as a consequence of multisensory unitization between a visual feature (shape)
and an auditory feature (white noise), which is an argument in favor of the “feature imprinting”
view of memory. Nevertheless, for the exemplars seen in the Low Frequency condition
of learning, we observed a discrimination effect in the priming phase. The isolated exemplar
presented with sound enhanced the processing of target tones compared to the exemplars
seen without sound during the learning phase. This discrimination effect
could be interpreted as a consequence of multisensory integration between
visual features (shape and level of brightness) and an auditory feature (white noise), which
is an argument in favor of the “whole imprinting” view of memory. Taken together, these
results suggest the existence of multiple levels of representation (i.e., feature and exemplar,
see Navarro & Lee, 2002), or multiple levels of processing (i.e., dimensional and
featural), or both, during retrieval.


2 Following Goldstone (1998), we refer here to “whole imprinting” and “feature imprinting”.

The second issue is related to the first one but concerns the ability of memory
to produce qualitatively distinct forms of knowledge. We proposed that each form
of knowledge emerges from the activation, integration, and synchronization
of multiple memory traces (see also Versace et al., 2009). The difference between
episodic and semantic knowledge is thus no longer qualitative but rather quantitative, i.e., in
terms of the number of episodes or traces that are reactivated. We suggest that information is
maintained in memory through a hierarchical multimodal memory integration mechanism.
We consider that this mechanism, as presented in Figure 6, may be of relevance
for the expression of the different forms of knowledge (e.g., semantic and episodic)
and the various types of memory processing (i.e., categorization, recognition, memory
retrieval).

In this model, an object is assumed to be perceived as a unified object because all its 
features are gradually integrated with one another. However, contrary to the exemplar- 
based approach, we suggest that what is stored in memory is the result of each inte- 
gration at each level of LTM. We argue that a competition is involved during feature 
integration. This competition depends on both the distance between exemplar features 
within and between categories, and on the frequency of the presentation of the com- 
binations of the different features. 

In addition, we suggest that not all levels are necessarily accessed for the processing
of an exemplar in a given task: 1) to categorize an exemplar, it is sufficient to
activate the unitized dimension that is relevant for the category; 2) to recognize an
item, it is necessary to activate each unitized feature that is relevant for the exemplar.

Figure 6: Illustration of multimodal hierarchical integration between features in long-term memory (adapted from
Murray & Bussey, 2007).


In conclusion, we propose that each form of knowledge emerges from the dynamic
interactions between multisensory units, which are both perceptual and mnesic in
nature. As a consequence, the distinction between memory and perception might exist
only at the phenomenological level. In other words, it is the subjective attribution of the
cognitive activity (whether to a component that is perceptually present or absent) that would
determine the nature of this activity.


While the symbol grounding problem of agreeing on a mapping between symbols and
sensory or even sensorimotor grounded concepts has been solved to a large extent, 
one possibly even deeper open problem remains: How do concepts and compositional 
concept structures develop in the first place? Concepts may be described as integrative 
mental representations that encode certain sensory, motor, or sensorimotor states or 
events. Compositionality, on the other hand, determines how concepts are associated 
with each other in a semantically meaningful and highly flexible manner. We argue that 
progressively complex concepts and compositional structures can be developed starting 
from very basic perceptual and motor control mechanisms. An experiment with a simple 
simulated robot gives hints about highly relevant structural ontogenetic prerequisites 
for their development. In the outlook, we conclude by sketching out the current most 
pressing challenges ahead. 

Keywords: concepts, compositionality, development, symbol grounding, language, neu- 
ral networks, manifolds, anticipation 


1 Introduction 


Symbols are “placeholders” standing for other entities. In a dictionary, and often in
conversation, symbols are explained through other symbols. This is a potentially endless
process called “semiosis” by the philosopher Charles Sanders Peirce: Symbols are
described by symbols, which are described by symbols, and so on. But how can this
endless process be ultimately grounded, how “is symbol meaning to be grounded in
something other than just more meaningless symbols?” (Harnad 1990, p. 340). This is
what Harnad (1990) calls the “symbol grounding problem”.

While Steels (2008) states that the basic symbol grounding problem has been solved,
it has also been pointed out that a yet deeper symbol grounding problem needs to be addressed
(cf. Barsalou 2009, Harnad 1990, Sugita & Butz 2011). The robotic agents in
Steels’ works are able to come to an agreement about a symbol convention for particular
communication realms (such as gestures, colors, etc.). That is, a common language
is developed where particular symbols or utterances are associated with particular
perceptions or perception-action complexes. The challenge of the deeper symbol
grounding problem lies in the development (a) of compositional concept structures from
sensorimotor control capabilities and (b) of associations between those structures and
grammatical, symbolic, i.e. linguistic structures. Only when these two challenges are
met can formal semantics actually be grounded in sensorimotor codes.

The study of both the developmental progression that leads to the grounding of compositional
concepts and the nature of the involved structures and associations is expected
to provide insights into how “Cognitive Semantics” (Johnson 1987, Lakoff 1987, Lakoff
& Johnson 1980) actually pre-determines formal semantics and most likely even structural
properties of the universal grammar (Chomsky 1965). Most recently, the idea
of cognitive semantics led to the proposition of a Minimalist Action Grammar (Pastra 
& Aloimonos 2012), which was directly related to the Minimalist Program by Noam 
Chomsky (1995). The Minimalist Action Grammar is a generative grammar that en- 
ables both proper generation and parsing of sentences about physical interactions. It 
binds an interaction by its final goal, combining tool complements, which are about the 
acting force, with object complements, which are about the affected object, context- and 
goal-dependently. 

We are particularly interested in how such a Minimalist Action Grammar may develop
starting purely from embodied, sensorimotor interactions, in the hope of contributing
to the deeper symbol grounding problem sketched out above. The aim is to
develop a self-motivated system that perceives its environment solely via sensory stimulations
and that probes its environment by motor activities, where sensors and motors
are coupled by the bodily morphology. Ultimately, such a model may show that many
structures present in the Universal Grammar are grounded in sensorimotor interactions
with the environment that are realized by an embodied agent. Meanwhile, such a line
of research is expected to also shed light on why and how grammatical structures in
language are structured in the way they are (hints of which can also be found in the
Minimalist Action Grammar).


Towards Grounding Compositional Concept Structures 


Various researchers now strongly believe that sensorimotor structures and the selec- 
tive simulation of particular sub-structures set the stage for the development of com- 
positional concept structures (Barsalou 2008, Grush 2004, Pastra & Aloimonos 2012, 
Pezzulo 2011). How such structures are developed and how these structures may then
be coupled with higher-level cognitive, symbolic encodings is still an open question,
though. While the claim that the compositionality of language may be grounded in the
compositionality inherent in interaction competencies is not new (Johnson 1987, Lakoff
1987), it remains equally unclear how such grounding may be learned and how
compositionality may be represented by means of sub-symbolic structures. Arbib (2005)
proposed a developmental pathway that leads from interactions, the mirror neuron sys- 
tem, and imitation capabilities over several further stages to linguistic competence. We 
believe that these stages are important components in the development of concepts and 
compositional concept structures. However, several other prerequisites appear manda- 
tory. 

The aim of this paper is to sketch out a path by means of which complex, compo- 
sitional concept structures are action-grounded. We propose that in order to explain 
the human capacity to generalize, to draw inductions, and to develop compositionality, 
it is not necessary to resort to innate structures. Rather, as increasingly many robotic 
architectures and even more so simulations with neural networks imply, compositional 
concept structures can be developed by a brain “from scratch”, departing from
sensorimotor contingencies. Endorsing the “Cognitive Semantics” of Lakoff and Johnson
(1980), we propose to take the next step towards confirming this theory by identifying the
ontogenetic ingredients that appear necessary to develop such semantics. Thus, we are
interested in the architectural constraints and learning biases necessary for developing 
compositionality based on sensorimotor interactions. 

In this way, the paper also takes a stand in the nature/nurture-debate about concepts. 
In particular, we propose that structures which rationalists tend to regard as purely
innate are actually derivatives of sensorimotor experiences and developmental
constraints. Thus, we propose a nature-constrained “nurture” process, in which genetically
determined bodily and brain developmental constraints steer cognitive development
towards the acquisition of compositional concept structures and language readiness.
However, the language capacity can develop only with the additionally necessary
environmental interactions, including linguistic communication. Consequently, concepts
are grounded in the experienced interactions, but genetic predispositions bias the
cognitive developmental process towards concept acquisition.

We argue that purely innate structures leave no flexibility and are generally
extremely questionable due to the immense depth of the necessary structures and due
to the fact that even innate structures need to be somehow coupled to perceptions
and actions. Thus, a core claim of this paper is that the Symbol Grounding Problem
(Harnad 1990) can only be solved by an empiricist approach to concept acquisition. In
contrast to Fodor’s (1975, 2008) radical claim that concepts cannot be learned, we
suggest that a theory of concept learning is essential for a complete theory of cognition
and the mind.

In the following, we first detail a neural network architecture with which it has
recently been shown that representational separations and multiplicative interactions
between modules are essential ingredients for the development of compositional
concept structures. We detail the type of compositional structures that were developed and
how compositionality was thus grounded in embodied sensorimotor interactions. We
discuss the implications of this study, but also its limitations and currently most pressing
challenges. Finally, we put the insights gained into the broader perspective of how
concepts and compositionality may develop.


2 An Experiment with a Simulated Robot Platform 


In a neural network simulation setup, it was shown that a second-order neural network
with parametric bias neurons (sNNPB) is able to develop generalized behavioral
control routines when presented solely with typical sensorimotor time series data
(Sugita, Tani, & Butz 2011). This study essentially offers tentative answers to the
question: How can compositional concept structures self-organize based on experienced
sensorimotor interactions? Additional ingredients will be necessary to scale this
approach to more complex environments and interaction capabilities.

In the experiment, a simulated robot interacted with colored objects. The robot was
equipped with two wheels for controlling motion and a camera that scanned the
surroundings in front of the robot. In particular, the camera reported the perceived
dominant hue and color intensity values covering an area of 120° in front of the robot. The
covered area was partitioned into nine equally sized sectors. The robot learned two
types of interactions: move-to and orient-towards a particularly colored object. In the
move-to interaction, the robot had to move to the object and stop in front of it. In the
orient-towards interaction, the robot had to simply orient itself towards an object at
a specific angular offset; five offsets were trained. One or two colored objects were
present during each interaction trial with the environment. During learning, the
actions of the robot were controlled remotely by a hard-coded control program. Figure
1 illustrates the robot-environment-sNNPB interaction.

Figure 1: Robot-Environment-sNNPB interaction
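
To make the sensory layout concrete, the following minimal sketch partitions a 120° field of view into nine sectors and reports one (hue, intensity) pair per sector. Only the field of view and the number of sectors are taken from the description above. The bearing convention, the sector indexing, the value ranges, and the rule that the most intense object dominates a sector are assumptions made for illustration; this is not the simulator used in the study.

```python
SECTOR_COUNT = 9            # from the text: nine sectors
FIELD_OF_VIEW_DEG = 120.0   # from the text: 120 degrees in front of the robot


def sector_index(bearing_deg):
    """Map a bearing to a sector index (0 = leftmost, 8 = rightmost).

    Assumed convention: bearing 0 is straight ahead, negative is to the left.
    Returns None if the bearing lies outside the field of view.
    """
    half = FIELD_OF_VIEW_DEG / 2.0
    if bearing_deg < -half or bearing_deg > half:
        return None
    width = FIELD_OF_VIEW_DEG / SECTOR_COUNT
    index = int((bearing_deg + half) / width)
    # A bearing exactly on the right border is assigned to the last sector.
    return min(index, SECTOR_COUNT - 1)


def encode_view(objects):
    """Build the camera reading from a list of (bearing_deg, hue, intensity).

    Assumption: within a sector, the object with the highest intensity
    determines the reported "dominant" hue. Empty sectors report (0.0, 0.0).
    """
    view = [(0.0, 0.0)] * SECTOR_COUNT
    for bearing_deg, hue, intensity in objects:
        index = sector_index(bearing_deg)
        if index is not None and intensity > view[index][1]:
            view[index] = (hue, intensity)
    return view
```

Under these assumptions, an object straight ahead falls into the middle sector (index 4), and flattening the nine pairs gives an 18-dimensional input vector.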

In the following, we will refer to the two types of interactions as the “verbs” that
were trained, to the different colored objects as the “objects” that were addressed in
the interactions, and to the offsets in the orient-towards interactions as the involved
“modifiers”. Note, however, that the learning system was not provided with any explicit
indicators (neither about the “verbs” nor about the “objects” or the “modifiers”)
that may have given clues or induced learning biases towards distinguishing “verb”,
“object”, and “modifier” concepts. The only information given to the learning system
was the sensorimotor time series data the robot was trained on and the information that
particular sets of sensorimotor time series data belonged to the same type of interaction.

The resulting sensorimotor time series data was used to train an sNNPB. An sNN is
a traditional neural network trained with backpropagation that, however,
includes some “second-order” neural connections. Second-order neural connections
are essentially connections whose current weight values are determined by other neural
activities. In the conducted simulations, one sub-NN mapped the visual information
provided by the camera onto motor output, transferring the information over two
hidden layers. The weights of the connections from the second hidden layer
to the motor output, however, were determined by second-order connections. The
associated neurons were activated by a second sub-NN with one hidden layer. Input to
this network was generated by “parametric bias neurons” (Tani 2003). Error
backpropagation was used to adjust the weights of the sNNPB as well as the activities of the
parametric bias neurons. The latter were adjusted separately for each interaction, thus
maintaining a vector for each type of verb-object-modifier interaction the system was trained
on.
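
The forward pass of such an architecture can be sketched as follows. This is a minimal illustration and not the implementation of Sugita, Tani, & Butz (2011): all layer sizes, the tanh activations, the random initialization, and the names `SecondOrderNet`, `motor_weights`, and `forward` are assumptions. The sketch only shows the structural point made above, namely that the weights from the second hidden layer to the motor output are not free parameters but are generated by a second sub-network driven by the parametric bias (PB) activities.

```python
import math
import random


def linear(weights, inputs):
    """Matrix-vector product; `weights` is a list of rows."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def tanh_layer(weights, inputs):
    """Linear map followed by a tanh nonlinearity."""
    return [math.tanh(v) for v in linear(weights, inputs)]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class SecondOrderNet:
    """Toy second-order network with parametric bias input (sizes are assumed)."""

    def __init__(self, n_in=18, n_h1=8, n_h2=6, n_motor=2,
                 n_pb=4, n_pb_hidden=5, seed=0):
        rng = random.Random(seed)
        self.n_h2 = n_h2
        self.n_motor = n_motor
        # Sensory sub-network: camera input -> hidden 1 -> hidden 2.
        self.w1 = random_matrix(n_h1, n_in, rng)
        self.w2 = random_matrix(n_h2, n_h1, rng)
        # PB sub-network with one hidden layer; it outputs one value for
        # every connection from hidden layer 2 to the motor units.
        self.v1 = random_matrix(n_pb_hidden, n_pb, rng)
        self.v2 = random_matrix(n_motor * n_h2, n_pb_hidden, rng)

    def motor_weights(self, pb):
        """Second-order step: PB activities determine the motor weights."""
        flat = linear(self.v2, tanh_layer(self.v1, pb))
        return [flat[m * self.n_h2:(m + 1) * self.n_h2]
                for m in range(self.n_motor)]

    def forward(self, sensors, pb):
        """Map sensors to motor output under the mapping selected by `pb`."""
        h2 = tanh_layer(self.w2, tanh_layer(self.w1, sensors))
        return [math.tanh(v) for v in linear(self.motor_weights(pb), h2)]
```

Because the PB-generated weights multiply the activities of the second hidden layer, the same sensory encoding can drive different motor behavior depending on the PB vector. In this toy version, a PB vector of all zeros produces zero motor weights and therefore zero motor output.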

After learning, the sNNPB was tested on other object constellations and on other,
untrained verb-object-modifier interactions. For example, the sNNPB may never have
been trained on “move-to the blue object”. Nonetheless, after learning it was tested
whether the system could generate such interactions. To do so, the activity of the
parametric bias neurons was set to the values that best matched a small set of
generated example interactions. After that, other constellations were tested applying
these PB activities.
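
The idea of fitting only the PB activities to a few example interactions, while all connection weights stay frozen, can be sketched as follows. Two substitutions are made here and should be read as assumptions: the gradient is estimated by finite differences instead of the error backpropagation used in the study, and `toy_model` is a hypothetical two-parameter stand-in for the trained network. The names `squared_error` and `infer_pb` are likewise illustrative.

```python
def squared_error(model, pb, examples):
    """Summed squared motor error of `model` under a given PB vector."""
    total = 0.0
    for sensors, target in examples:
        output = model(sensors, pb)
        total += sum((o - t) ** 2 for o, t in zip(output, target))
    return total


def infer_pb(model, examples, n_pb, steps=200, lr=0.1, eps=1e-4):
    """Gradient descent on the PB vector only; the model itself is not changed.

    The gradient is approximated by central finite differences, which stands
    in for backpropagating the error to the PB units.
    """
    pb = [0.0] * n_pb
    for _ in range(steps):
        grad = []
        for i in range(n_pb):
            up = pb[:i] + [pb[i] + eps] + pb[i + 1:]
            down = pb[:i] + [pb[i] - eps] + pb[i + 1:]
            grad.append((squared_error(model, up, examples)
                         - squared_error(model, down, examples)) / (2 * eps))
        pb = [p - lr * g for p, g in zip(pb, grad)]
    return pb


def toy_model(sensors, pb):
    """Hypothetical frozen network: PB values act multiplicatively on sensors."""
    return [pb[0] * sensors[0] + pb[1] * sensors[1]]


# A few example interactions generated with a PB vector the learner never sees.
true_pb = [0.5, -0.3]
examples = [(s, toy_model(s, true_pb))
            for s in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])]
recovered = infer_pb(toy_model, examples, n_pb=2)
```

In this toy setting the recovered vector approaches `true_pb`. Once found, the vector would be held fixed while the system is confronted with new constellations.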

The results confirmed that the sNNPB generalized over the provided sensorimotor 
time series data. It was not only able to generate similar interactions in other environ- 
mental constellations, but also to generate interactions that were only compositionally 
related to those trained on. For example, it was able to orient itself towards a particular 
colored object at a particular angle, while it only had been trained to move to such a 
colored object. Thus, behaviorally the network exhibited generalization capabilities that 
were of a compositional nature. Interactions that corresponded to untrained
verb-object-modifier constellations could be generated, as long as a sufficiently large
and distributed subset of other interactions had been trained.

Moreover, analyses of the developed sNNPB showed that a self-organized, geometrically
arranged manifold structure had developed, which reflected the behaviorally exhibited
compositionality. In particular, the activity vectors of the parametric bias neurons
were considered for further analysis. A principal component analysis showed that the
first principal component differentiated the interactions with respect to the modifier.
The second principal component differentiated move-to from orient-towards. The third
and fourth principal components revealed a color ring encoding, akin to the one found
in the hue-based color encoding provided to the sensory input layer. Thus, activities
in the parametric bias neurons self-organized via backpropagation learning into a
compositional manifold structure, where the individual dimensions in the manifold
corresponded to the verb, object, and modifier components of the individual interactions.
The manifold structure enables the sNNPB to flexibly activate any meaningful
verb-object-modifier interaction type and also allows generalizing to untrained interaction
types. The geometric, orthogonal arrangement was akin to a compositional concept
structure because the orthogonality enables flexible interaction concept combinations
and the deducible geometric distances can be viewed as indicating concept similarities.
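
An idealized version of such an orthogonal layout can be written down directly. The sketch below is not derived from the learned PB activities; it hand-constructs a four-dimensional vector with one axis for the modifier, one for the verb, and two for a hue ring, and all coordinate values, names, and the use of degrees for hue are assumptions chosen for illustration.

```python
import math

# Hypothetical coordinates on the "verb" axis.
VERBS = {"move-to": -1.0, "orient-towards": 1.0}


def concept_vector(verb, hue_deg, modifier):
    """Compose an idealized 4-D interaction code.

    Axis 0: modifier (e.g. a scaled angular offset), axis 1: verb,
    axes 2 and 3: hue placed on a unit circle (the "color ring").
    """
    angle = math.radians(hue_deg)
    return [modifier, VERBS[verb], math.cos(angle), math.sin(angle)]


def distance(a, b):
    """Euclidean distance, read here as a rough measure of concept similarity."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


red_move = concept_vector("move-to", 0.0, 0.0)
red_orient = concept_vector("orient-towards", 0.0, 0.0)
orange_move = concept_vector("move-to", 30.0, 0.0)
cyan_move = concept_vector("move-to", 180.0, 0.0)
```

Because the components occupy separate axes, any verb can be combined with any hue and modifier, including combinations that never occurred together. Nearby hues yield nearby vectors (`red_move` is closer to `orange_move` than to `cyan_move`), and the ring closes on itself, so a hue of 360° coincides with a hue of 0°.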

The structure of the second hidden layer (the one that maps to motor
output via the second-order neural connections) was also analyzed. Strongly
behavior-oriented sensory encodings were found. For example, one neuron switched its activity
from off to on when an object was in the center and very close, resulting in braking
behavior when the move-to interaction was activated in the parametric bias neurons. Other
neurons revealed activities that may be compared to gain fields in neurons (Salinas
& Sejnowski 2001, Graziano 2006): neurons responded, for example, in a sinusoidal
fashion with respect to color, but that response was linearly modulated by the
direction from which the color was perceived. In effect, this encoding allowed the flexible
activation of particular color-respective encodings for approaching and orienting the
robot towards particular colors, dependent on the activated mapping given particular
parametric bias activity. From a broader perspective it can be said that object-relative
encodings developed that encoded “object affordances” (according to Gibson 1979), in
the sense that the encodings afforded reaching a particular orientation towards a
particular object or stopping when coming close to an object. Providing yet another
interpretation, spatial, object-relative encodings were developed that could be directly
mapped onto motor activities, yielding a flexible Braitenberg vehicle (Braitenberg
1984).
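
The gain-field-like response described above, sinusoidal in color and linearly modulated by direction, can be illustrated with a single hypothetical unit. The functional form, the preferred hue, the normalization to the range 0 to 1, and the ±60° direction range (half of the 120° field of view) are assumptions; the sketch does not reproduce the activations measured in the trained network.

```python
import math


def gain_field_response(hue_deg, direction_deg,
                        preferred_hue_deg=0.0, max_direction_deg=60.0):
    """Response of one hypothetical hidden unit.

    The hue tuning is sinusoidal (1 at the preferred hue, 0 at the opposite
    hue). The gain grows linearly from 0 at the far left of the field of view
    to 1 at the far right. The two factors combine multiplicatively.
    """
    tuning = 0.5 * (1.0 + math.cos(math.radians(hue_deg - preferred_hue_deg)))
    gain = (direction_deg + max_direction_deg) / (2.0 * max_direction_deg)
    return tuning * gain
```

With these assumed parameters, the preferred hue seen at the far right yields a response of 1, the same hue straight ahead yields 0.5, and the same hue at the far left yields 0. A downstream motor weight selected by the PB activity could thus read out both which color is present and where it is.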

The network succeeded in developing these compositional concept structures with- 
out the provision of any semantic cues besides the ones that were inherent in the senso- 
rimotor time series data. Seeing that various other neural network architectures could 
not yield similar generalizations, it was concluded that (a) goal-oriented encodings need 
to be separated from sensorimotor, control-oriented encodings and (b) a multiplica- 
tive approach is best-suited to project the goal-oriented encodings onto the sensorimo- 
tor encodings for realizing flexible and compositional goal-oriented behavioral control. 
In the emergent, interaction-specific, goal-oriented encodings the mentioned
compositional concept structures could be found, whereas in the processed sensory encodings
behavior-oriented signals could be found. Both were shown to be mutually dependent
on each other: the former selecting the actual interaction that should be executed; the
latter providing potential interaction options.

Seeing that various other neural network architectures were not able to generate
comparable compositional behavioral generalization capabilities, let alone actual
identifiable compositional structures such as the one characterized above, the results suggest
that sensory-to-motor mappings should be separated from interaction selection
encodings to enable the development of compositional concept structures. Essentially, the
interaction selection corresponds to the goal that is to be achieved, with consideration
of the components that bring each particular goal about, such as moving to a
particularly colored object. While various researchers have suggested that such
separations are behaviorally necessary (Cisek 2007), we believe they have not been sufficiently
considered in research on the development and structure of language and cognition.


3 Insights and Open Challenges Deducible from the Robot Experiment


The results of the simulated robot experiment have shown that compositional concept 
structures could only develop in this setup when the sensory-to-motor mapping was 
separated from the goal encoding, that is, from the code that determines which sensory- 
to-motor interaction should actually unfold. Also, the time dynamics had to be different 
in the two encodings in that one goal activity had to be maintained while one full senso- 
rimotor object interaction unfolded. Moreover, it was necessary that the influence from 
the goal encoding onto the sensory-to-motor mapping was multiplicative. Finally, the 
generated sensorimotor time series data had to be separated into distinct sets with re- 
spect to particular verb-object-modifier combinations. However, no information about 
the semantics or symbolic characterizations of these particular combinations had to be 
provided. 

In consequence, sensorimotor-grounded compositional concept structures and
behavior-oriented “Braitenberg encodings” co-developed, the latter being encodings that are
perfectly suited to be directly mapped onto motor output activities, yielding seemingly
goal-directed behavior (Braitenberg 1984). Braitenberg encodings are thus
goal-oriented encodings, which can be selectively mapped onto actions for pursuing particular
object interactions. Indeed, the compositional concept structures had structural
similarities with the emerging Braitenberg encodings, thus enabling the selective activation
of particular Braitenberg codes for realizing particular object interactions.
Compositionality was achieved by embedding a manifold structure into a higher-dimensional
neural representation. The individual dimensions of the lower-dimensional (in the
experiment, four-dimensional) manifold corresponded to the compositional
verb-object-modifier structure. The developed “object” concept was encoded on a two-dimensional
manifold (actually a circular manifold), mimicking the hue-based color encoding in the
simulated sensors. Due to the emerging orthogonal arrangement of the distinct concept
structures, the sNNPB was able to flexibly compose any verb-object-modifier
interaction, even if it had not been trained. The developed compositional concept structure
appeared to be perfectly suited to be associated with a corresponding action grammar.

However, at this point language structures have not yet been successfully associated
with developing compositional structures. Sugita & Tani (2005) managed to
associate symbolic structures with similar sensorimotor time series data. However, in
this case only a more rudimentary action grammar consisting of three possible verbs 
and six possible colors was learned. Nonetheless, Sugita and Tani (2005) succeeded in 
mutually shaping both the symbol-based linguistic encoding and the sensory-to-motor 
mapping. Thus, associating symbolic, linguistic input with developing, self-organizing, 
more complex action grammars is still a very hard challenge. 

However, even when focusing only on the challenge of developing pre-linguistic
compositional concept structures, without associating symbolic language components,
additional learning biases and developmental constraints seem mandatory for
scalability reasons. At the moment, the sNNPB architecture is still an extremely flexible
learning architecture. For developing more complex compositional structures, it seems
necessary that the learning processes are further guided by additional learning biases.
However, overly constrained learning may not give enough room for the emergence of
compositional concept structures, such as the manifold structure identified in the robot
experiment. Thus, complex compositionality is likely to emerge only if a good balance
between learning biases on the one hand and self-organization on the other hand is
maintained.

Another challenge lies in the fact that sets of sensorimotor time series data had
to be explicitly distinguished when training the sNNPB, while a more autonomous
separation of different types of interactions is desirable. While similarity thresholds
may distinguish the sensorimotor time series data, it is very hard to find the right
distance metric that could suitably distinguish different time series in a semantically
meaningful way. The self-organized topology in the PB neurons of the sNNPB is likely
to be the best candidate, but its development relied on the distinctness information
in the first place.

We believe that several of the following ingredients will be mandatory to develop 
learning systems that can autonomously produce emergent compositional concept struc- 
tures in more complex environments. First, the incorporation of an anticipatory drive 
(Butz 2008) that stresses the capability of predicting the future based on state, context, 
and motor (force) activities seems necessary. Such an anticipatory drive may guide 
learning first towards identifying the most obvious sensorimotor contingencies in the 
sensory and motor information available to the system. Further distinctions starting 
from basic sensorimotor flow may then lead to the desired progressively more distinct 
compositional concept structures. 

Once sensorimotor contingencies are identified, sensorimotor topologies can be de- 
veloped within which particular interactions can unfold. In the simulated robot exper- 
iment, a topology was implicitly developed in the deep sensory encodings, providing 
Braitenberg codes. Similar, but further modularized encodings are necessary to enable 
the even more flexible and selective interaction with the environment using different 
means, different pathways through the environment, etc. 

Furthermore, active, information-seeking, curious behavior, caused by the anticipa- 
tory drive, may enable the more direct identification of relevant concept structures, that 
is, of sensory and motor information necessary for predicting particular consequences 
reliably. The result will be the identification of contextual “concepts” that separate
states into categories that are relevant for particular behaviors, such as free versus
occupied, heavy versus light, etc.

Besides these learning biases derived from the anticipatory drive, the challenge of
removing the requirement of providing distinct sets of sensorimotor time series data
may be met by introducing internal motivations. Such internal motivations
may serve as the distinctness indicators, identifying a distinct interaction by its
distinct effect on the internal motivational state. Thus, distinct positive and negative
reinforcement may serve as a critical additional cue for distinguishing interactions further
into meaningful concepts.

Finally, it seems somewhat unsatisfactory that the activity in the parametric bias
neurons cannot be internally self-activated. To enable this, the parametric bias
neurons may be partially activated by sensory input as well, potentially enabling the
selective activation of those interaction codes that can actually unfold in the current
circumstances. For example, a potential interaction with a red object may only be
activated if a red object is present. Furthermore, the mentioned internal motivations
may be associated with those parametric bias neuron activities that previously
led to a corresponding change in the internal motivational state. Consequently, the
interaction choice may be co-determined by the internal motivations and the goals
currently possible in the environment.


4 Conclusions 


The robot experiment described above contributes to the solution of the symbol
grounding problem, and also illuminates concept learning. One of the most vexing problems
regarding this topic is Fodor’s problem of concept acquisition. Fodor (1975, 2008)
essentially questions whether fundamental concepts (those that cannot be further partitioned
into smaller conceptual entities) can be learned. Presuming that they cannot
be learned, he concludes that they must be innate. The details of Fodor’s argument
are beyond the scope of this article. It suffices to state that according to most recent 
philosophical considerations, “it appears that Fodor’s problem of concept acquisition 
remains a puzzle for philosophers and psychologists to solve” (McCaffrey & Machery 
2012, p. 275). 

We propose to overcome Fodor’s “radical concept nativism” (cf. Laurence & Margolis 
2002) by a different stance towards “innateness”. This very ambiguous term may gain 
a more specific sense if it is related to embodiment. In short, we propose that the 
innateness of concepts may not be directly genetically imprinted, but concepts and 
compositional concept structures may be indirectly pre-determined to develop due to 
(a) the ontogenetic path laid out in the genes of the organism, (b) the morphological
constraints given by the body of the organism, and (c) the environmental reality with 
which the organism interacts. 

Fundamental concepts may indeed be innate, but innate in the sense of being
behaviorally embodied and predestined to be developed. For example, basic reflexes,
such as the grasp reflex in infants, can foster the development of particular concepts,
such as a concept for grasping. By then separating successful from unsuccessful grasps,
a concept structure develops that specifies the prerequisites for a successful grasp, in
contrast to contexts where grasps are unsuccessful. Co-developing with such a
representation is a concept of graspable entities. Realizing the effects of successful grasps will
expand and differentiate the grasp concept further into entities that are moveable, light
versus heavy, spiky versus smooth, etc. The basic reflex may thus lead to the
generation of sensorimotor interactions that can be differentiated on the one hand by
their perceptual differences but, even more importantly, by their distinct effects.

Essentially, we point out that the combination of an anticipatory drive with an
embodied, sensing and acting agent can foster the development of pre-linguistic,
compositional concept structures. The anticipatory drive drives the organism to actively search
for and learn about predictable and controllable (sensorimotor) structures in the envi- 
ronment (Butz 2008). Due to this self-controlled, embodied developmental process, the 
developing concept structures are inherently meaningful because the structures deter- 
mine predictability, controllability, and their relation to changes in internal motivational 
states. Thus, the combination of the human body morphology with its ontogenetic 
development of body and brain fosters the development of “innate” but behaviorally 
acquired compositional concept structures. 

Unitizations and differentiations in the sense of Landy & Goldstone (2005) (cf. also
Stöckle-Schobel 2012) are fundamental processes that foster the development of
compositional concept structures. We propose that these processes are not purely perceptual
or sensorimotor, but are developed for predictability, controllability, and achievability
purposes. With this proposition we go one step beyond theorists of “neo-empiricism”
like Barsalou (2009), Jesse Prinz (2002), and others. We strongly acknowledge that their
accounts of perceptually grounded symbols and concepts are highly important in
overcoming unworkable accounts of innateness. However, we would like to further stress
that cognition, and more specifically concept acquisition, is not solely shaped by
(and for) perception. Rather, it is most important for being able to interact flexibly
and in a goal-directed manner with objects and other agents.

Moreover, the robot experiment has shown that spatial, object- and body-relative 
representations should be separated from goal-oriented representations in order to fos- 
ter the development of compositional structures. Given this separation, particularly 
the goal-oriented representations appear well suited for the development of
compositionality. Thus, the separation of dorsal and ventral pathways (Goodale & Milner 1992),
which is certainly highly behaviorally relevant and mandatory for realizing flexible be- 
havioral control (Cisek 2007, Milner & Goodale 2008), may have actually set the stage 
for the development of compositional concept structures, that is, structures that allow 
the development of language in the first place. 

Certainly other processes are still highly important as well. In particular, we believe
that the development of mirror capabilities and tool use are two fundamental additional
ingredients. The capability of mirror neurons, which was most likely first beneficial for
improving mutually beneficial interactions with other individuals, fosters the further
development of communication between individuals by, for example, enabling the
development of verbal imitations from gestural imitations (Arbib 2005, Rizzolatti & Arbib
1998). The capability of handling tools led to the development of much more intense
interactions between the dorsal and ventral processing streams, thus making it possible
to view tools and objects as part of the subject and, in retrospect, also oneself as a tool (Iriki
2006).

However, we believe that the sketched-out processes will set the stage for
ultimately solving the mystery of concept acquisition. By separating goals from
spatial topologies and events, flexible goal-directed behavior can be selected and pursued.
Current internal goals can be flexibly pursued dependent on the current spatial con- 
straints. Moreover, the availability of potential goals in the environment as well as the 
context-dependent estimated achievability of such potential goals can yield tremendous 
behavioral flexibility and effectiveness. While the development of such a separation was
thus initially most likely purely behavior-driven, it also enabled the development of 
compositional concept structures. While potential goals and the involved concepts for 
achieving these goals are detached from the here-and-now, the encodings can be flexibly 
projected onto the current state in the environment. Meanwhile, state representations 
must have developed that enable the flexible activation of goals and involved concepts 
for pursuing particular goals. Object-referenced encodings found in the parietal cortex
(Chafee, Averbeck, & Crowe 2007) support the pro-motor representations found in
integrative, multimodal cortical areas. The parietal-frontal interactions with which ac- 
tion goals appear to be transferred into actual movement control support their strong 
goal- and behavioral relevance (Graziano, Cooke 2008). Arguably, similar correspon- 
dences were even proposed to exist between Wernicke’s and Broca’s areas (Graziano, 
Cooke 2008). Finally, gain-modulations, which are found nearly ubiquitously in the 
brain, suggest selective, multiplicative computations in individual neurons (Salinas & 
Sejnowski 2001), supporting the flexible, goal-oriented selection of maximally suitable 
sensory-to-motor mappings. 

In the Minimalist Action Grammar as proposed by Pastra & Aloimonos (2012), goals
unify particular actions with objects and further modifiers. Our proposition in this
paper gives first hints as to why goals are crucial both for the development of grammatical
structures and for being able to flexibly combine compositional concept structures to
achieve particular goals dependent on their current urgency and achievability.
Nonetheless, much future research is necessary to sort the identified puzzle pieces, identify even
further pieces, and arrange them in the way the ontogenesis of the brain manages to
do so beautifully.


Abstract

In this paper, I investigate the relationship between natural language and thinking. 
Specifically, I adopt the view that thinking operates, by and large, according to asso- 
ciationistic rules and argue that natural language plays a crucial role in thinking, but 
not a constitutive one, as many have argued. I propose that the suggested view enjoys 
significant empirical support, mainly from work done with aphasic subjects. The major 
challenges that all associationistic views of thinking face are the problems of proposi- 
tional thinking and compositionality of thought. I briefly suggest how these challenges 
could be met in the light of the suggested view regarding thought production. 
Keywords: Language; Cognition; Associationism; Aphasia; Concept Empiricism 


1 Introduction 


The relationship between language and cognition is a much-debated one and widely 
varying notions of this relationship have been produced over the last few decades in 
fields as varied and diverse as psychology, linguistics and philosophy. The main di- 
alectic of this debate is centred on the issue of the significance of natural language in 
cognition. It is worth clarifying at this point that there is the issue of ‘whether thought 
happens in language’ and secondly ‘whether the language in which thought happens, if 
it does, is natural language’. The problem is that certain thinkers, Fodor for instance (see 
below), answer the first question emphatically ‘yes’ (language of thought), and others 
with an emphatic ‘no’. As a result, their answer to the question ‘how important is the 
role of language to thought?’ is potentially ambiguous. In the following, when talking 
about language I will be referring to natural language unless stated otherwise. 

1 This paper is an early draft of Tillas, A. (forthcoming 2015). Language as Grist to the Mill of Cognition.
Cognitive Processing.

The main strands in this debate can be briefly classified as follows. I start from
views that bestow the least significant role on language in the production of thought,
and continue by examining views that ascribe language a greater role. Grice (1957;
1968; 1969; 1989) treats language as independent to thought and as merely being used to 
express non-linguistic thoughts. Linguistic communication is seen as primarily a matter 
of a speaker changing a hearer’s mental states, e.g. getting them to form a certain 
belief, through recognition of the contents of their thoughts. (The hearer recognises the 
thoughts of the speaker on the basis of the latter’s usage of words). Elsewhere, Grice 
(1982) speculates that language may have evolved in order to facilitate correspondences 
in psychological states between one creature and another. Proponents of similar views 
argue for a reductive account of linguistic meaning to thought meaning. In this sense, 
language is independent from thinking. A second view can be found in Fodor’s (1978; 
1983; 1987) Language of Thought Hypothesis (LOTH). For Fodor, thinking occurs in an 
inner sub-personal code which he calls ‘Mentalese’. Mentalese is distinct from natural 
language and hence the role of natural language in thought is also limited. Language 
is mainly used for expressing the underlying thoughts in public form. Proponents of 
similar views, at least according to Carruthers (2005), include Chomsky (1988), Levelt 
(1989) and Pinker (1994), amongst others. Another view is that of Carruthers (1998; 
2005; 2008) according to which the language of thought is actually natural language. In 
this sense, natural language plays a greater role in thinking than merely communicating 
thoughts from an unconscious to a conscious level. Carruthers holds that language is 
constitutively involved in thinking and inner thinking occurs as a form of inner speech. 

Further views that ascribe a significant role to language in thinking can be found
in the works of thinkers like Davidson and Brandom, who see thinking as secondary
to language. More specifically, for Davidson (1975) thoughts are only attributable to
creatures that are interpretable. A creature that we cannot interpret as capable of
meaningful speech is a creature that we cannot interpret as capable of possessing contentful
attitudes². For Brandom (1994), thought does not take place in language but thought can
only be attributed to linguistic agents. Thought and language acquire content through 
their mutual interrelations. But despite this mutual interrelation, Brandom promotes 
the significance of language over that of thought since he argues that the objectivity 
of conceptual norms derives from public linguistic practice. 

There are also views that could be seen as somehow equidistant from the two
extremes of the above continuum. The view suggested here also lies at the middle of the
continuum, and in this section I clarify how it differs from competing views.


2 See also Malpas (2009).

The beginnings of supra-communicative views of language can be traced to William
James’ (1890/1996) idea that language, and words in particular, allow for a clearer
distinction between different concepts³. Vygotsky (trans. 1962) further analyses this idea
and argues for the influence of natural language on cognitive development and its
scaffolding role in guiding behaviour and directing our attention.

This Vygotskian scaffolding idea enjoys support from the work of Berk and Garvin 
(1984) who show that language (in the form of self-directed vocal or silent speech) 
guides the actions of children of 5-10 years of age. They found that silent speech is more 
frequent in cases where the child is alone and when she is engaged in more sophisticated 
tasks. Bivens and Berk (1990) and Berk (1994) found that an increased incidence of silent
speech correlated strongly with higher levels of mastery of the task in question. From
this evidence, Berk draws the conclusion that self-directed speech is a crucial cognitive 
tool that allows us to direct our attention to specific aspects of a new situation and 
direct problem-solving actions. 

Gauker (1990) also suggests a view of language as a tool for effecting changes in
the subject’s environment (as opposed to a tool used for representing the world or for
publicly expressing one’s thoughts). Language plays the role of a medium through which
subjects can grasp the causal relations into which linguistic signs may enter. 

For Jackendoff (1996), linguistic formulation gives us a ‘handle’ for attention and
with it the possibility to attend to relational and abstract aspects of thought, and thus
puts us in a position to scrutinise those aspects.

One of the most prominent views that fall under the ‘middle-of-the-continuum’
umbrella is that of Clark (1998) and Clark and Chalmers (1998), who argue for the causal
potencies of language and suggest that language complements our thoughts. Here,
the mind is seen as using external props to reduce the cognitive costs of thinking and
enhance performance, especially with regard to the formation of structurally highly
sophisticated thoughts. Even though thinking can be purely internal, it often relies on available
external resources and uses them in a constitutive way. Language is not coincidentally
available; rather, it exists to function as a prop for thought. Focusing on
a connectionist view of the mind, Rumelhart et al. (1986) also treat language as a crucial
element for various environmentally extended computational processes.

Dennett (1991) ascribes a more ‘extreme’ role to language and argues that the
advanced cognitive skills that the human mind exhibits are the effects of culture and
language. In this sense, the main cognitive differences between the human mind and that
of primates like chimpanzees cannot be captured in terms of our initial hardwiring.
An even stronger view comes from Whorf (1956), who famously suggests that linguistic
differences in grammar and usage shape and alter the ways in which we come to
conceptualise and experience the world⁴.

3 Here I follow Clark’s (1998) terminology for views that ascribe more than a communicative role to language.
Most views presented here are reported in Clark (ibid.).

Finally, the language of associationistic thinking hypothesis (LOATH), the view
suggested here, also lies somewhere at the middle of the aforementioned continuum.
By and large, LOATH is a view that builds upon associationism and ascribes a significant 
role to natural language in terms of its contribution to thinking but crucially it is not 
a constitutive one. 

Before elaborating on LOATH, I clarify a number of preliminary issues
such as what thinking amounts to, at which point we get conscious access to our
thoughts, and what it is for a subject to have endogenous control over her thoughts.
I then present my views on the role of natural language in thinking and provide
empirical evidence, mostly from work done with aphasic subjects, in support of my
claims. Finally, I assess the consequences of my account by evaluating whether a bigger
role should be ascribed to language. In doing so, I examine Carruthers’s argument,
given that he treats language as constitutively involved in thinking.


2 Elaborating on LOATH: thinking is analogous to perceiving 


Despite the fact that the role of language in thinking is often subject to a lively
debate, few things are settled with regard to what thinking amounts to. For proponents
of the view that thinking occurs in language, thinking occurs either in a Mentalese
sub-personal code or in the form of inner speech; but as explained above, not everyone
believes that thinking does in fact happen in the form of language. In the view I suggest
here, thinking is analogous to perceiving to the extent that the same representations
that were produced during perception of a given object get reactivated when thinking
about this object (e.g. Barsalou 1999; Damasio 1989). That is, on recalling a given
concept, e.g. DOG, the brain simulates, to use Barsalou’s term, the perceptual experience
of a dog: the same neuronal configurations that were active while perceiving a
dog would also be activated when thinking of a dog (see also Barsalou 1999; and Prinz
2002: esp. chap. three). At the same time, thinking is different from perceiving since
the phenomenology of thinking is different for obvious reasons.

4 But see Patterson and Fushimi (2006) for evidence that the brain’s organisation of language is in fact the
same regardless of the language the subject speaks.

Fleshing out the notion of simulation further, consider Damasio’s (1989) ‘convergence
zones’ hypothesis. During perception of a given object, different groups of neurons
underlie perception of different parts/properties of the object in question. Further
down the line of interneural signalling, the output of the neurons that underlie perception
of a dog’s head, for instance, converges with the output of the neurons that underlie
perception of the dog’s bark, legs, fur, etc. In this way, these different neuronal
ensembles interact in a way that they did not before. And they did not interact before
because they are dedicated to the perception of different kinds of stimuli. Convergence
zones register combinations of components in terms of coincidence or sequence in space
and time (co-occurrence). Representations of the parts of the perceived object are
reconstructed by time-locked retro-activation of fragmented records in multiple cortical
regions. This is the result of feedback activity from convergence zones. That is, the
groups of neurons that fired in a specific way during the sensory experience with the
given object are re-activated simultaneously and in exactly the same way that they were
activated during the initial perception of the object in question. In this way, a given
object is not only perceived as a whole but is crucially also represented in memory (and
later on reactivated) as a whole, precisely because what actually gets stored are the
simultaneous activation patterns that underlie perception of that object. A key point here is
that we only have conscious access at the level of a convergence zone and not at the
level of the fragmented representations of an object in geographically spread neuronal
groups. It is for this reason that we perceive objects as wholes and not as conjunctions
of different features and properties. This claim will play a significant role in the
second part of the paper where I reply to Carruthers’s claims about the relation between
language and thinking.


2.1 Endogenously controlled thinking 


LOATH is based on a view of concepts according to which a concept is a structured
entity comprised of a set of representations. These representations are formed during
perceptual experiences with instances of a given kind. Also included in this
set is the perceptual representation of the appropriate word (e.g. Barsalou 1999). For
instance, the concept DOG is comprised of a set of perceptual representations built out
of experiences with instances of dogs, together with the perceptual representations of
the word ‘Dog’. These representations get associated on the basis of co-occurrence.

To have the ability to endogenously control the tokening of a given concept, and 
thus to endogenously control thinking, is to be in a position to activate a given concept 
in the absence of its referents, i.e. to token a thought on the basis of processes of 
thinking. In my view, endogenously controlled thinking is merely associative thinking, 
i.e. current thinking caused by earlier thinking. Here, I am committed to a view of 
internal thinking which is imagistic, to the extent that conceptual thoughts are built out 
of concepts, which are in turn built out of perceptual representations. In the suggested 
view, concepts are associationistic in their causal patterns. That is, every concept is
associated with other concepts. Once a concept is activated, the concepts associated with it
also get activated⁵. For example, consider someone uttering the word ‘Trip’ and another agent
mistakenly hearing the word ‘Grip’ and as a result starting to think about friction and 
laws of physics instead of travelling. This is a case where an agent is forming a thought 
in the absence of an appropriate stimulus, seemingly in a spontaneous but actually in 
an associative manner. In the previous example, the subject in question forms a thought 
without being confronted with an instance of the kind in question, in this case the word 
‘Grip’. 

Note here that endogenous control over concepts (i.e. the ability to activate a
concept in the absence of its referents) could also be acquired in ways different from the one
suggested here. For instance, non-linguistic animals might acquire endogenous
control over their concepts by associating a given set of representations to some sort of
non-linguistic action, e.g. goal-directed actions over which they do have endogenous
control. This might also be the case with human subjects at early developmental stages.
The suggested hypothesis then is that when a subject finally does acquire a certain
degree of linguistic sophistication, the process of activating a concept in a top-down
manner is achieved by virtue of associated linguistic symbols being activated. Note also
that there are cases when we form a thought ‘on the fly’ by activating a set of images in
a top-down manner and consciously manipulating those images. For instance, consider
being in a store and trying to think whether a particular sofa would fit in your living
room. This is a clear case when a thought is formed by virtue of images being
consciously manipulated. Clearly, the activated images/representations of the inner space
of one’s living room do not have to be constitutive parts of the concept LIVING ROOM.
What is important here, though, is that these representations are only activated in virtue
of their associations to certain concepts, which in turn are also activated either during
or (right) before the activation of the imagistic thought in question.

5 Evidence in support of the suggested associationistic view of thinking can be found in the work of Elman
et al. (1996), amongst others, who argue that artificial neural networks can be highly constrained by the
network’s current weight assignment.

In a nutshell, endogenous control over thoughts is acquired by associating concepts
with linguistic symbols. My hypothesis here is that we have endogenous control over
our production of linguistic items, given that we are able to produce linguistic
utterances (or to talk silently to ourselves) at will. It is this executive control over linguistic
utterances that gives us endogenous control over our thoughts.


2.2 Associationist accounts and propositional thoughts 


On the previous pages, I presented LOATH, an associationistic view of thinking in 
which language plays a significant but not a constitutive role in thinking. As such, 
LOATH might be subject to the objection that it cannot account either for propositional 
thinking or for compositionality of thought. However, I suggest that those problems 
could be solved by appealing to natural language. Let me elaborate. 

The reason why it is not obvious how LOATH could account for propositional think- 
ing is that it at best describes how interconnected concepts get activated but does not ex- 
plain the propositional-syntactic properties that thoughts, in the form of inner speech, 
actually have. In a sense, propositional thoughts somehow involve or are about a num- 
ber of different items for which we have individual concepts. In a propositional thought, 
those individual concepts are structured together. The way that individual concepts are 
structured is important, since the same concepts can be structured in different ways. For 
instance, there is a clear structural difference between the thought ‘John loves Mary’ 
and the thought ‘Mary loves John’ (cf. Fodor & Pylyshyn 1988). The difference between 
propositional and non-propositional thoughts is that propositional thoughts are com- 
plex structured entities that are true or false. In this sense, some thoughts seem to have 
a unified coherent propositional structure and content⁶, whereas individual representations
seem to lack these features. The question then is how is it that we can move from
the individual representations to having mental representations that have this kind of 
propositional content? 

6 Structure and content are different since there could be mental atoms that have propositional content.

In reply, a single thought gets to be propositional in structure and content by
piggybacking on language. My starting point is that sentences are syntactically structured.
Sentences are unified structured entities and they unify and structure the concepts
associated with the components into a propositional thought in a way that mirrors the unity
and structure of the sentence. A thought gets to have propositional content by virtue
of concepts (for objects or features) being associated with individual words or phrases;
the sentence provides a kind of unity. In this sense, it is the conventional grammatical
unity and structure of the sentence that unifies those concepts and orders them in a
certain way. It is by virtue of this that thoughts have particular propositional
content. Furthermore, the external linguistic item orders and, in a sense, binds the different
constitutive-to-the-proposition parts together and unifies thoughts.

As it happens, most of those raising the objection of propositional thinking against 
associationist accounts seem to find a better alternative in LOTH. What is appealing 
about LOTH here is that Mentalese is structurally (or grammatically) analogous to natural
language. In this way, a thought is tokened as propositional. As explained in Section
4 below, Carruthers also objects to associationistic accounts and he favours a view in
which natural language is constitutively involved in thinking, i.e. natural language becomes
a language of thought. Thus, for Carruthers, thoughts do not occur in Mentalese,
but rather natural language is itself the medium through which conscious thinking is
conducted. In this sense, thoughts are propositional because natural language, which
of course is propositional, is constitutively involved in thinking. Both of the above
theses can account for propositional thoughts while it is claimed that associationist 
accounts cannot. 

As shown above, representing linguistic items allows an agent to escape from the 
patterns of association that they would have been locked into had it not been for the 
conventional structure of sentences and their conventional patterns of implications. In 
this way, an agent can extend the repertoire of these associations beyond the actual
inductive pattern of objects as she has encountered them. For instance, one can think of
black swans even though one has only seen white ones. This is possible because some
of the patterns of associations that one can fall into by using the concept SWAN are
underpinned by and arise from the conventional structure of language. So, the (version
of the) problem of propositional thinking (that I focus on here) is solved by latching 
onto the external artefacts of public language. 

In a nutshell, I claim that an agent could extend the repertoire of associations beyond
a) their hardware endowment and b) the patterns of experiences that their history has
given them, by forming associations with linguistic items. These latter associations
are much less constrained by the agent’s individual experience history and much more
constrained in other ways, i.e. by the rules of grammar, the norms of epistemology and so
forth. It is in this way that thinking in a more flexible and open-ended way is achieved.
The suggested view clearly bears similarities to the Extended Mind Hypothesis
(briefly examined above) and Clark’s (2005) suggestions. The main difference between
the two is that my focus is at a more general level. In particular, I do not focus on specific
cognitive tasks that might be propped up by language or on how specific processes, like
those involved in perceptual categorisation, are facilitated or influenced by language.
Instead, my focus here is on how language affects thought formation.


2.3 Associationism and compositionality 


Another problem that associationistic accounts of thinking face is the problem of com- 
positionality. One of the characteristics of concepts is that they can combine compo- 
sitionally. The problem for associationistic accounts is that it is not clear how they 
can give an account of the ways in which concepts, the ingredients of thoughts, can 
be put together to produce something where the meaning of the whole depends on the 
meanings of the parts and the ways in which they are put together. The problem of 
compositionality is particularly vivid for prototypes. For instance, the conjunction of 
PET and FISH gives PET FISH. However, the prototypical pet is something like a cat 
or a dog; the prototypical fish is something like a trout while the prototypical pet fish is 
rather a goldfish (cf. Fodor and LePore 1996). If thoughts are formed in an associationistic
manner, how is it that concepts can combine compositionally?

This is a very interesting problem which, however, lies beyond the scope of this pa- 
per. That said, a solution can be suggested; one that can be seen as another way in which 
language influences thinking. My main claim is that since thinking piggybacks on lan- 
guage, the solution to the problem of how thinking is compositional piggybacks on the 
solution of how language is compositional. Admittedly, this is a different problem, and 
one on which I do not further elaborate here since it lies in the realm of philosophy of 
language. 

Returning to the problem of compositionality of thought and assuming that language
is compositional, according to LOATH the concept PET FISH is a folder that contains
perceptual representations. At this point, I align myself with Prinz’s semantic account
(2002), according to which, in order for C to refer to X, the following two conditions, (a)
and (b), have to be fulfilled:

(a) Xs nomologically covary with tokens of C
(b) An X was the (actual) incipient cause of C


In this sense, the incipient causes of PET FISH can either be instances of pet fish or 
representations of pets and representations of fish. What is important, in terms of the 
semantics, is that PET FISH has to nomologically covary with pet fish rather than a 
disjunction of pet and fish. In other words, PET FISH will be activated every time
the subject is confronted with, or is thinking of, an instance of pet fish. This is a nomic or
counterfactual-supporting relation. The reason why PET FISH nomologically covaries
with pet fish is that the concept’s functional role is constrained by the constraints on 
the uses of the word that are set by the agent’s locking into the conventions of how 
conjunctions are formed. In this sense, an agent is a participant in a convention and 
it is via the association between the word and the concept that the functional role 
of the conjunctive concept is constrained. Taking a closer look at the constitutive 
representations of PET FISH now, these representations can be representations of pets 
like cats and dogs as well as representations of fish. Note that those representations are 
idle in the functional role of the concept. The latter is more constrained by its link to 
the words. 

I do not further elaborate on the problem of compositionality here. However, it
should be clear that even though proponents of associationist accounts of thinking do
not have a fully fleshed out solution, they can tack the solution that philosophers of
language will offer to the problem of how language can be compositional onto their
claims about thinking.


3 LOATH and empirical evidence: thoughts, language, and the evidence from aphasia


In the following sections, my aim is to examine LOATH against empirical evidence. I
do that by arguing that it is not clear how proponents of the communicative conception
of language could account for evidence gathered from work done with aphasic subjects,
which shows that aphasics cannot form endogenously controlled thoughts. The reason
why this is useful for my purposes is that aphasia is generally understood as a language
disorder. Admittedly, there are different kinds of aphasia and each kind can affect
linguistic comprehension and communication to different degrees. Furthermore, several
brain regions are affected in cases of aphasia. By and large though, aphasic subjects
are unable to understand and use spoken or written language due to brain lesions. To
this extent, I focus on the linguistic aspects of aphasia⁷. Furthermore, even though,
as mentioned already, language plays a key role in the acquisition of endogenously
controlled thought, stimulus-driven thought might not necessarily involve language.
For instance, it might be that a stimulus produces a perception, which in turn causes
activation of concepts by associationistic links that are piggybacking on language. In
this sense, a fair quantity of stimulus-driven, yet fairly complex, cognitive processing
can occur in aphasics. However, the suggested account predicts that there will be a
dramatic drop in performance amongst aphasics executing sequential and reasonably
difficult tasks, and more specifically in performance of tasks in which endogenous
control of thought is required. This is because, as previously explained, a key claim of
LOATH is that endogenous control is acquired on the basis of language, and aphasics
are by and large subjects with ‘compromised linguistic systems’.

In order for proponents of the view that language is not involved in endogenous
control of thinking to accommodate evidence similar to that presented below, they need
to establish a double dissociation between language and endogenous control. That
is, they have to show that aphasic subjects, who are linguistically impaired, can
nevertheless activate concepts in a top-down manner and also that (at least in some
cases) subjects who are linguistically unimpaired cannot activate concepts in a
top-down manner.

In general terms, the empirical evidence presented here shows that there is a
correlation between linguistic impairments and endogenously controllable thinking. Thus,
the option available to proponents of views contrasting with the one suggested here is the
following. First of all, they need to adopt a massively modular view of the mind. In this
case, it can be claimed that a distinct module governs activation of concepts in a
top-down manner, and perhaps a separate module (or modules) governs all other linguistic
functions. It can then be claimed that in the cases presented below, both the language
module and the top-down-activation-of-concepts module are impaired. Nevertheless,
those two modules are distinct from each other⁸. If a massively modular view of the
mind is adopted, the aforementioned double dissociation can be achieved since there
might be cases where only one of the above (two) modules is impaired while the other
is spared. Note here that a Fodorian view of the mind as merely modular cannot account
for this evidence since, in that view, there is only one language module responsible for
all linguistic functions. There are various reasons why a massively modular view of the
mind is problematic, even though I do not further elaborate on this issue here⁹. Having
dealt with the negative argument supporting the suggested view, I now turn to positive
considerations.

7 Section 3 has been significantly revised after publication of this volume. The main reason for this is that
aphasia is not an absolute language deficit, as is implied here, and more relevant and recent empirical
evidence has been considered. However, in later drafts it is shown that the suggested view still enjoys
significant empirical support from work done in perceptual processing and categorisation tasks.

8 Evidence in support of this claim can be found in Pinker (1994), Brock (2007) and Mervis and Beccera (2007).
The latter demonstrate that language abilities in Williams Syndrome are no more than would be predicted
by non-linguistic abilities. Furthermore, there is evidence suggesting that specific language impairments
(SLI) related to use of language might be of a more general cognitive nature (Norbury, Bishop & Briscoe
2001; Bishop 1994; Kail 1994, amongst others). I do not further elaborate on this issue here.


3.1 Drawing and recollection in aphasic patients 


Gainotti et al. (1983) systematically examined the effects of aphasia on drawing from 
memory. Furthermore, they investigated the relationship between the performance of
subjects and both the clinical form of aphasia and the severity of language impairment at the
semantic level of language integration¹⁰. They also investigated whether aphasics were
more impaired than subjects with right-brain and left-brain injuries but without any 
aphasia. All of these results were compared to the results from a control group of 
normal subjects of the same mental age, and comparisons were drawn between perfor- 
mances of the impaired and control subjects. 

9 For instance, evidence from Gregory (1970) and Barnes, Bloor and Henry (1996) could be used to counter the
cognitive impenetrability thesis. The cited evidence shows that cognition seems to penetrate perception.
This in turn counters one of the main characteristics of modules, namely informational encapsulation. I
do not elaborate further on this here. See also Prinz (2006) for an extended attack on the modular view
of the mind on different grounds.

10 An impairment at the semantic level of language integration can be detected by asking patients to
discriminate the meaning of a given word by choosing, from an array of semantically similar alternatives,
the object corresponding to the stimulus word (ibid. 616).

During these experiments, subjects were briefly shown drawings of simple objects
with a characteristic shape (a nail, a pear, a key, a comb, a cluster of grapes, a table, a
hand and an umbrella). The experimenters made sure that each subject had analysed the
details of the object and recognised it by asking the subject to name it. The experimenter
then hid the drawing and asked the subject to draw the same object from memory. It should
be noted that the instructor asked for the drawing by naming the object, for example:
“Could you please draw the comb that you just saw?” This process was repeated for each of
the ten objects. Finally, two independent judges evaluated the drawings. Two points were
given to drawings that contained most of the object’s characteristic features and thus
could be easily recognised. One point was given to drawings that contained some of the
characteristic features of the object and could still be recognised. Zero points were
given when the drawn object was unrecognisable. The points given by the two judges were
added, so that each subject could obtain a maximum score of forty.
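The scoring rule just described is simple arithmetic, and a short sketch may make the ceiling of forty explicit. The Python below is an illustration only, assuming ten drawings per subject and a rating of 0, 1 or 2 from each of two judges; the function name `total_score` and the sample ratings are hypothetical and do not come from Gainotti et al.

```python
# Illustrative sketch of the scoring rule described in the text.
# Assumption: each of two judges rates each of ten drawings 0, 1 or 2.
# The function name and the sample ratings are hypothetical.

VALID_RATINGS = {0, 1, 2}
N_DRAWINGS = 10


def total_score(judge_a, judge_b):
    """Sum both judges' ratings over all drawings (maximum 2 * 2 * 10 = 40)."""
    if len(judge_a) != N_DRAWINGS or len(judge_b) != N_DRAWINGS:
        raise ValueError("each judge must rate exactly ten drawings")
    if not set(judge_a) <= VALID_RATINGS or not set(judge_b) <= VALID_RATINGS:
        raise ValueError("ratings must be 0, 1 or 2")
    return sum(judge_a) + sum(judge_b)


# Invented ratings for one subject, for illustration only.
judge_a = [2, 2, 1, 0, 2, 1, 1, 2, 0, 2]
judge_b = [2, 1, 1, 0, 2, 2, 1, 2, 1, 2]
print(total_score(judge_a, judge_b))
```

A subject whose ten drawings all receive two points from both judges reaches the ceiling of forty; any unrecognisable drawing lowers the total by up to four points.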

At a different stage of the test, subjects were tested for constructional apraxia and 
were given models and figures, ten in total, to copy. Once again, two independent 
judges evaluated the drawings (copies) on the basis of a rating system similar to the one 
described above. 

On the basis of their symptoms, aphasic subjects were assigned to four major aphasic
syndromes (Broca’s, Wernicke’s, anomia and conduction aphasia). I will not examine the
different types of aphasia further since, as the results show, such a classification is
not central for my present purposes.


3.1.1 Results 

The mean scores obtained by aphasic subjects on the Drawing from Memory and Copying
Drawings tests are presented in Table 1 and compared with the mean scores of normal
controls and of nonaphasic subjects with right- and left-brain lesions. As shown in the
first column, aphasic subjects had the lowest mean score on the drawing from memory test,
while the differences on the copying test were not as dramatic. In fact, aphasics
performed slightly better on the latter test than subjects with right-brain damage, whom
the examiners consider the most appropriate control group, given the damaged brain areas
in aphasic subjects.


Mean scores            Aphasic     R. brain-   Nonaphasic L.    Normal
                       patients    damaged     brain-damaged    controls
                       (n=57)      (n=67)      (n=44)           (n=23)

Drawing from Memory    21.59       28.08       31.16            33.78
Copying Drawings       33.83       33.53       37.70            37.04

Table 1: Results obtained by aphasics, normal controls, and non-aphasic right- and left-brain-damaged
patients on the tasks of drawing from memory and of copying geometrical drawings (adapted from
Gainotti et al., 1983).


Commenting on these results, Gainotti et al. remark that aphasics are significantly more
impaired than any other group on the ‘drawing objects from memory’ test, but not on the
‘copying drawings’ test. On these grounds, they argue that the poor performance of aphasic
subjects on the drawing from memory test cannot be regarded as one aspect of a generic
visuo-constructive disorder.
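The contrast between the two tests can be made concrete by subtracting each group’s mean score for drawing from memory from its mean score for copying. The sketch below uses only the means reported in Table 1; the dictionary layout, the function name `memory_gap` and the idea of summarising the contrast as a single difference are illustrative assumptions, not an analysis reported by Gainotti et al.

```python
# Mean scores copied from Table 1 (adapted from Gainotti et al., 1983).
# The dictionary layout and the "gap" measure are illustrative assumptions,
# not an analysis reported in the source.

TABLE_1 = {
    "aphasic": {"memory": 21.59, "copying": 33.83},
    "right brain-damaged": {"memory": 28.08, "copying": 33.53},
    "nonaphasic left brain-damaged": {"memory": 31.16, "copying": 37.70},
    "normal controls": {"memory": 33.78, "copying": 37.04},
}


def memory_gap(scores):
    """Copying mean minus drawing-from-memory mean, rounded to two decimals."""
    return round(scores["copying"] - scores["memory"], 2)


gaps = {group: memory_gap(scores) for group, scores in TABLE_1.items()}
for group, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{group}: {gap:.2f}")
```

On these figures the gap is largest for the aphasic group (12.24 points) and smallest for normal controls (3.26 points). Differences between group means say nothing about statistical significance, for which the original study has to be consulted.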

When subjects with different aphasic syndromes and different levels of severity were
tested, performance was not influenced, at least not to a significant degree, by the type
of aphasic syndrome or the severity of the damage. Based on these results, Gainotti et al.
claim that neither the type of aphasic syndrome nor the severity of the damage is crucial
to the deficit that aphasic patients show in drawing from memory.

The most striking result for my present purposes from the Drawing from Memory test is
that aphasic subjects with semantic-lexical impairments performed systematically poorly,
while aphasic subjects with no such impairments performed significantly better. In this
sense, there is a strong correlation between semantic-lexical impairment and poor
performance on the drawing from memory test. These results are illustrated in Table 2.


                       Presence of semantic-lexical    Absence of semantic-lexical
                       impairment (n=30)               impairment (n=27)

Copying Drawings       33.54                           35.62
Drawing from Memory    17.52                           26.33


Table 2: Mean scores obtained by aphasic patients with and without semantic-lexical impairment (adapted from
Gainotti et al., 1983)


In a nutshell, the results that Gainotti et al. obtained from these experiments show the
following. First, aphasic subjects were significantly poorer than control groups at the
drawing from memory test. Secondly, the examiners did not detect any significant
correlation between performance on the drawing from memory task and either the type of
aphasia or the severity of the impairment. Most importantly, though, a significant
correlation was detected between poor performance at the drawing from memory task and
disruption at the semantic-lexical level of language integration.

The importance of these findings, for my purposes, stems from the fact that they show that
aphasic subjects have a compromised ability to access representations and activate
concepts stored in memory, particularly in the absence of the referent of the concept in
question. This claim is supported by the following facts: a) the participating aphasic
subjects did not suffer from any form of visuo-constructive disability; b) a significant
correlation was detected between impaired drawing from memory and disruption at the
semantic-lexical level of language integration; c) aphasic subjects are impaired in using
and/or understanding spoken or written language. In this sense, the above results suggest
that language makes possible, or in any case facilitates, the ability to endogenously
control stored representations. I will try to build a stronger case for this claim by
appealing to further empirical evidence in the following paragraphs. Before that, though,
allow me to briefly discuss a methodological issue.

A possible objection to the design of these experiments is that subjects were not asked to
draw just anything from memory, but a particular object that had just been shown to them.
In this sense, Gainotti et al. cannot securely eliminate the possibility that the poor
performance of the subjects was due to a short-term memory defect rather than to a
conceptual inability to reproduce from memory the form of objects that have a
characteristic shape¹¹. In reply, Gainotti et al. claim that this objection is unsound,
since the examiners did not ask the subjects to reproduce from memory a more or less
meaningful object but rather tried to evoke in the subject the concept of the object, by
naming it, and then asked the subject to draw the named object. Furthermore, they claim,
by reference to the work of Faglioni and Spinnler (1969), that it is right-brain-damaged
patients, and not aphasics, who are particularly impaired in tasks of immediate and
delayed memory of meaningless visual patterns.

Gainotti et al.’s results are supported by Bay’s (1962) claim that aphasics are
unable to reproduce from memory the crucial characteristics of a given object due to
a basic conceptual disorder.


In an attempt to focus only on the conceptual (as distinct from linguistic) competences
of aphasics, Bay (1962) conducted a different series of experiments. Aphasic subjects
were given an incomplete drawing, e.g. a cup without a handle, and were asked
to complete the drawing, i.e. to draw the missing part. Originally, this test was
conducted by Meili, who asked subjects to name the missing part. Meili’s aim was
to give instructions without using any verbal elements and hence to focus on the
conceptual abilities of aphasics. Bay went a step further by asking subjects not to
name but to draw from memory the missing part. Bay reports that not a single subject
was able to draw the missing part unless she was also able to name it. (At a later
stage, subjects were asked to model from memory objects of their choice in plastic
material, in order to eliminate possible errors arising from the transformation from a
three-dimensional to a two-dimensional object, since this transformation presupposes a
knowledge of rules, such as those of perspective, which cannot be presumed in all
subjects. The results were similar to those from drawing.)


11 Conceptual inability is an inability to reproduce (for instance, when drawing a given object) the basic
characteristics of the object in question.

Based on the results of their experiments, Gainotti et al. suggest that Bay’s proposal
could be made more specific by claiming that there is a strong correlation between
conceptual and semantic-lexical disintegration. By stressing this relation, findings
about aphasics who demonstrated excellent capabilities in drawing from memory can be
accommodated by claiming that language disturbances in those subjects were due to
phonological and/or phonetic disorders and not to a semantic-lexical impairment. Had
subjects been able to think of the right answer to the examiner’s question while being
unable to utter the relevant words, the results would not have shown anything significant
about the workings of the cognitive system of aphasic subjects and hence could not be
used in favour of the view presented here.

Semantic-lexical impairments in aphasic subjects are also significantly related to an
inability to understand the meaning of symbolic gestures (evidence reviewed in Gainotti,
1983). In a similar fashion, Gainotti et al. (1979) showed that there is a relation
between semantic-lexical disturbances and the inability of aphasic subjects to appreciate
relationships between pictured objects with different levels of conceptual similarity,
e.g. chair and stool, bowl and cup, etc.


3.1.2 Interpreting the results 

The results of the above experiments show a significant correlation between
semantic-lexical impairments and particular deficiencies, such as an inability to
appreciate conceptual similarities between objects or to understand simple gestures. The
most interesting result for my present purposes is the correlation between
semantic-lexical impairments and the inability of aphasics to draw from memory. The reason
is that recalling is a characteristic case of endogenously controlled thinking. Given that
aphasics have severe linguistic impairments, it might now be claimed that their inability
to endogenously activate a concept or a thought is down to their linguistic impairments.
This is especially the case given the characteristic relation between semantic-lexical
impairments and poor drawing from memory. Here is what I mean by this. First of all,
subjects were able to copy the perceived object and hence showed no signs of
constructional apraxia. Also, the instructor asked the subject to draw the object in
question by using its ‘name’ (e.g. “draw the comb that you just saw”). In this way, the
instructor was in a position to target the subject’s linguistic competences. On these
grounds, any inability to draw the object in question was due to the subject’s inability
to think of a comb, to continue with the same example, or to activate their concept COMB.
Had subjects been able to activate their concept COMB, perceptual representations of combs
would also have been activated and they would have been able to ‘copy’ them from memory
onto the piece of paper in front of them. From the above I suggest that we are able to
activate a concept or form a specific thought on the basis of the linguistic labels that
we have for the concept in question. Generalising further, I suggest that a subject’s
linguistic capacity is what provides endogenous control over their concepts.

Further evidence in support of the suggested role for language can be found in Farias
et al. (2006), who show that drawing facilitates naming; Swindell and Greenhouse
(1988), who study patients with right- and left-brain damage; and Bay (1962), who shows
that aphasics are unable to reproduce from memory the crucial characteristics of a given
object due to a basic conceptual disorder.


PART I 


4 Shall we give language an even bigger role? 


As mentioned above, according to Carruthers (2005), natural language is constitutively
involved in specific kinds of human thinking, particularly in conscious propositional
thinking. He claims that natural language is not merely a tool for communicating inner
thinking. Rather, natural language is itself the medium through which conscious
propositional thinking is conducted, i.e. Mentalese is a natural language. In this sense,
for Carruthers everyone’s Mentalese will be one of the natural languages they speak.
(For Fodor, on the other hand, Mentalese is distinct from any natural language.)

Carruthers has two arguments in support of the claim that language is constitutively
involved in thinking: 1) he uses evidence from Hurlburt’s (1990, 1993) work, which
suggests that thinking happens mostly in language, and 2) he offers a philosophical
argument to the effect that thinking has to happen in language or otherwise we will be
‘self-alienated’. I examine both below, though my main focus is on Carruthers’s
philosophical argument.

With regard to his first argument, Carruthers’s motivation stems from evidence from
introspection and in particular from the work of Hurlburt, who famously uses a
distinctive method for investigating inner life. Subjects are not brought into the lab and
asked to perform some task of introspection. Rather, their everyday life is interrupted by
randomly occurring beeps, and they are interviewed later on to report what was going
on in their minds when the interrupting beep happened. Subjects reported that in a
significant majority of the sampled cases inner thinking occurred in natural language
sentences. There were also cases where subjects reported that their thought did not occur
in the form of inner speech. For Carruthers these latter cases are instances of a
systematic illusion. That is, what we take to be non-inferential access to a thought is
in fact a swift bit of self-interpretation that we simply do not notice. Carruthers
supports his claim by referring to the work of Nisbett and Wilson (1977), who show that
there are a number of circumstances in which subjects confabulate self-explanations that
are manifestly false, without realising that this is what they are doing. Given that for
Carruthers non-inferential access to a thought means that language is constitutively
involved in that thought, his claim that agents are subject to a systematic illusion
contrasts with Hurlburt’s claim that there are what Hurlburt calls ‘amodal’,
non-linguistic thoughts. Once a subject reported having a non-linguistic thought, Hurlburt
followed this up with questions asking for more details about the thought, and subjects
consistently replied that it did not involve any language or images, and that they had no
visual phenomenology or anything similar. Note that Carruthers argues that the subjects in
question are under a systematic illusion because he only allows non-linguistic thoughts to
take the form of visual or other sorts of images, but not to be amodal. It should be
clarified at this point that Carruthers does not claim that all thought is linguistic. He
accepts that some conscious thoughts (images of some sort) can be non-propositional. What
he treats as illusory are exactly the cases in Hurlburt’s studies where subjects reported
that they were not thinking in inner speech (while for Hurlburt these are amodal
thoughts). In line with what has been said in Section 2.1, I suggest that those thoughts
might well be conscious manipulations of images which were activated by virtue of their
associations with concepts that were activated either simultaneously with or right before
the imagistic thought in question.

I have been arguing that thinking is imagistic and non-linguistic. In this sense, it
might be argued that Carruthers’s view and the one suggested here are to a certain
extent compatible with each other. Note, though, that there are crucial differences. For
Carruthers, only some thoughts can be non-linguistic, while I suggest that all thoughts
are imagistic in some way (visual, auditory, somatosensory, emotional, etc.). Clearly,
there is a tension between allowing space for non-linguistic thoughts and Carruthers’s
claim that language is constitutively involved in thinking. Acknowledging this tension,
Carruthers restricts his claims about the role of language to conscious propositional
thought. Crucially for present purposes, however, Carruthers asserts that imagistic
thoughts (apart from not being fully propositional) have content that can only awkwardly
and inaccurately be reported in the form of a ‘that’ clause (2005, 117). Carruthers
argues that imagistic theories of meaning or imagistic theories of thought are not sound,
as the standard arguments against them show¹². On these grounds, Carruthers argues that
imagistic thinking cannot colonise the whole domain of conscious thought, unless the
images in question are images of natural language sentences. In the latter case, the
imaged sentences will have the same causal role as the thought that produced them, and
will thus be constitutive of conscious thinking. The view I suggest here is different in
that thoughts and linguistic items are associated but distinct from each other.

Next, I turn to Carruthers’s philosophical argument in favour of the claim that language
is constitutively involved in conscious thought. According to Carruthers, proponents of
the communicative conception of language cannot account for the privileged nature of
introspection. The reason for this is that if language is seen as not essentially
implicated in thinking but rather as a medium that facilitates the communication of
thought, then the kind of access an agent has to her own thoughts is analogous to the kind
of access she has to the thoughts of a third person. Carruthers admits that an
interpretation will have to take place regardless of whether an imaged sentence is
constitutive of an occurrent thought or caused by the occurrence of a thought existing
independently of it. The difference is that if a communicative conception of language
is accepted, then the process of interpretation will occur downstream of the thought,
i.e. a thought will be tokened first and then the representation of that thought will be
interpreted by the agent herself, in the case of inner speech. By contrast, on the
cognitive conception of language that Carruthers suggests, the causal role of the token
thought in question depends upon its figuring as an interpreted image. In this case,
it is the imaged (and interpreted) natural-language sentence that results in the further
cognitive effects characteristic of entertaining a given thought.

Carruthers (2005, 117-8) formulates his argument that language is constitutively 
involved in conscious thought in the following way: 

1. Conscious thinking requires immediate, non-inferential, non-interpretative access
to our occurrent thoughts, and that access is distinctively different from our access
to the thoughts of other people.

2. Occurrent propositional thoughts either get articulated in inner speech or not.
In case they do, then inner speech is either constitutive of the thought-tokens in
question or not.

3. If the manipulation of natural language sentences in inner speech is not constitutive
of propositional thinking, then our access to the thoughts expressed in inner
speech is interpretative, and similar to the sort of access to thoughts of others, and
hence such thoughts of ours do not count as conscious (by 1).

4. The sort of access that we have to those of our occurrent propositional thoughts
that do not get expressed in inner speech also involves self-interpretation. Hence,
such thoughts too are not conscious (by 1).

5. So, if we engage in conscious propositional thinking at all, then natural language
sentences must be constitutively involved in such thinking (from 1, 2, 3, and 4).

6. But we do sometimes engage in conscious propositional thinking.

7. So, natural language is constitutively involved in conscious thought (from 5 and 6).


12 What Carruthers probably has in mind here are arguments against verificationism and some sort of
verificationist semantics.


It should be clear by this point that I agree with Carruthers that language plays a
bigger role than merely communicating our thoughts. I believe that language empowers
us not only to gain conscious access to our thoughts but also to shape new thoughts.
However, I believe that Carruthers is mistaken in thinking that natural language and
Mentalese have to be identified in order for us to be in a position to explain our
non-inferential access to our thoughts. In other words, I believe that premise three of
the above argument is false and hence that Carruthers’s conclusion does not follow.


4.1 Contra Carruthers: distinguishing language from thought 


Carruthers argues that in order to have non-inferential access to our thoughts, inner
speech needs to be constitutively involved in propositional thinking (premise 3).
Carruthers is mistaken in claiming that this is the only way in which non-inferential
access can occur. One alternative way to have non-inferential access to our thoughts is
associative thinking. For instance, it might be that the transition from a word to the
concept that has the very same content that the word expresses is an associationistic
link. On the suggested view, perceptual representations and words are associated in
memory. In Damasio’s terminology, this association is realised at the level of a
convergence zone. Note that this is not a case of language being constitutive of thoughts.
Rather, it is a case of co-activation of a concept’s different subparts: perceptual
representations of the appropriate word (A) and representations formed during perceptual
experiences with instances of a given object (B). This occurs by virtue of an instance
of a word activating A, which in turn activates B, resulting in the concept’s activation
as a whole. Nevertheless, and importantly, this kind of thinking is not interpretative. It
is not that an agent hears a word, say ‘cat’, and then tries to guess or infer what the
word means. Instead, on hearing the word ‘cat’ the concept CAT is activated. In this
sense, access to thinking is neither interpretative nor dependent on language being
constitutive of thought. Next, I flesh out in more detail the way in which
non-constitutive, non-inferential thinking is realised in the brain. First, I show that
language is not constitutively involved in thinking, and I then elaborate how
associationistic thinking can be non-inferential, in the way, for instance, that
Carruthers suggests.
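The associative route described above, from a heard word to A, from A to B, and so to the concept as a whole, can be pictured with a toy spreading-activation sketch. The Python below is only an illustration under strong simplifying assumptions: the node names, the link table and the rule that a concept is active once all its subparts are active are hypothetical and are not a model proposed in the text or by Damasio.

```python
# Toy illustration of the associative route from a heard word to a concept.
# All node names and links are hypothetical; this is not a model from the source.

from collections import deque

# Directed association links between representations.
LINKS = {
    "heard word 'cat'": ["A: word representation"],
    "seen cat": ["B: object representations"],
    "A: word representation": ["B: object representations"],
    "B: object representations": ["A: word representation"],
}

# A concept counts as active once all of its subparts are active.
CONCEPT_PARTS = {
    "CAT": {"A: word representation", "B: object representations"},
}


def spread(stimulus):
    """Return every node reached from the stimulus by following links."""
    activated, queue = {stimulus}, deque([stimulus])
    while queue:
        node = queue.popleft()
        for neighbour in LINKS.get(node, []):
            if neighbour not in activated:
                activated.add(neighbour)
                queue.append(neighbour)
    return activated


def active_concepts(stimulus):
    """Concepts whose subparts were all reached from the stimulus."""
    reached = spread(stimulus)
    return {name for name, parts in CONCEPT_PARTS.items() if parts <= reached}


print(active_concepts("heard word 'cat'"))
```

No step in `spread` compares candidate meanings or draws an inference; activation simply follows stored links. This is the sense in which the transition from word to concept is described as causal and non-interpretative.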

As explained in the first part of the paper, I take concepts to be built out of
perceptual representations of instances of a given kind and also perceptual
representations of words. In this sense, perceptual representations of objects and of
words are distinct from each other and are brought together in the process of concept
formation. My claim then is that these representations (or rather the neurons that
underlie them) converge at a level similar to that of a convergence zone. The claim that
representations of objects and words are distinct is key here, since it is partly on these
grounds that I reject Carruthers’s claim that language is constitutively involved
in thought formation. It is just that we only get to have conscious access at the level
where representations of words and objects converge. In this sense, an agent can
only access representations of objects and words simultaneously and treats them as if
they were constitutive parts of a concept/thought. It is in this way that I can account
for non-inferential access to thinking.

Going back to Carruthers’s argument, his claim was that in order to account for the
immediate access to our thoughts, imaged words and thoughts would have to be identified,
at least in the case of conscious propositional thought. In this section, I have shown
that associationism provides an alternative way of achieving non-interpretative thinking
without language being constitutively involved. In my view, the relationship between a
thought and its representation in self-knowledge is brute causation: a first-order
thought and the second-order thought that represents it are causally, not constitutively,
related. Thus, the relationship between a first-order and a second-order thought is not
constitutive, as Carruthers argues, but rather causal and associative.

On the basis of the claims made in this part of the paper, I argue that thought and
language are not constitutively connected because, as shown, thought can occur without
language. When thought does require language, it is in order for thought to have features
like propositional form and to be endogenously controllable. Given our basic perceptual
hardware and associationism as the engines of thinking, our thought would not have these
features had it not been for language.


5 Conclusion 


In this paper, I have examined the relation between language and cognition. My starting
points were that thinking is imagistic, to the extent that conceptual thoughts are built
out of concepts, which are in turn built out of perceptual representations; and that
concepts (the building blocks of thoughts) are associationistic in their causal patterns.
On this basis, I have presented a view of thinking according to which language plays a
crucial, but not a constitutive, role in thought production. I suggest that, unlike
available views, the account presented here enjoys support from independent empirical
evidence obtained from work done with aphasic subjects, while at the same time avoiding
the controversies of views which maintain that inner speech needs to be constitutively
involved in propositional thinking in order for us to have non-inferential access to our
thoughts. I also argued that the associationistic account of thought production presented
in this paper could accommodate propositional thinking and compositionality.


Acknowledgements 


I would like to thank Finn Spicer, Anthony Everett, Jesse Prinz, Michelle Montague,
Oystein Linnebo, and Patrice Soom for their help and comments on earlier drafts, as well
as audiences in the UK, Netherlands, Germany, Spain, and Greece for their feedback.
Also, I would like to thank Paschalis Kitromilides and Byron Kaldis for their support.


Role of the funding source 


Research for this paper was funded by the Onassis Foundation, Athens, Greece (ZF 75).
The sponsor was not involved in the study in any way.


6 References


Barnes, B., Bloor, D. and Henry, J. (1996). Scientific knowledge: A sociological analysis.
London: Athlone.


Barsalou, L. W. (1999). Perceptual symbol systems. Behavioral and Brain Sciences, 22, 
577-609. 

Bay, E. (1962). Aphasia and nonverbal disorders of language. Brain, 85, 411-426. 

Berk, L. (1994). Why children talk to themselves. Scientific American, November: 78-83. 

Berk, L. and Garvin, R. (1984). Development of private speech among low-income
Appalachian children. Developmental Psychology, 20, 2, 271-86.

Bishop, D. V. M. (1994). Grammatical errors in specific language impairment: Competence
or performance limitations. Applied Psycholinguistics, 15, 507-550.

Bivens, J. A., and Berk, L. E. (1990). A longitudinal study of the development of
elementary school children’s private speech. Merrill Palmer Quarterly, 36, 443-63.

Brandom, R. (1994). Making it explicit: Reasoning, representing, and discursive
commitment. Cambridge, MA: Harvard University Press.

Brock, J. (2007). Language abilities in Williams syndrome: A critical review.
Development and Psychopathology, 19, 1, 97-127.

Carruthers, P. (1998). Conscious thinking: Language or elimination? Mind and
Language, 13, 4, 457-76.

Carruthers, P. (2005). Consciousness: Essays from a higher order perspective. Oxford:
Clarendon Press.

Carruthers, P. (2008). Language in cognition. In E. Margolis, R. Samuels, & S. Stich
(Eds.), The Oxford handbook of philosophy of cognitive science (pp. 382-401). Oxford
University Press.

Chomsky, N. (1988). Language and Problems of Knowledge. MIT Press.

Clark, A. (1998). Magic words: How language augments human computation. In P.
Carruthers & J. Boucher (Eds.), Language and thought: Interdisciplinary themes (pp.
162-183). Cambridge: Cambridge University Press.

Clark, A. (2005). Beyond the flesh: Some lessons from a mole cricket. Artificial Life, 11,
1-2, 233-44.

Clark, A., Chalmers, D. J. (1998). The extended mind. Analysis, 58, 1, 7-19.

Damasio, A. R. (1989). Time-locked multiregional retroactivation: A systems-level
proposal for the neural substrates of recall and recognition. Cognition, 33, 25-62.

Davidson, D. (1975). Thought and talk. In his Inquiries into truth and interpretation (pp.
155-170). Oxford: Oxford University Press.

Dennett, D. (1991). Consciousness Explained. New York: Little, Brown and Co.


Elman, J. L., Bates, E. A., Johnson, M. H., Karmiloff-Smith, A., Parisi, D., Plunkett, K.
(1996). Rethinking innateness: A connectionist perspective on development.
Cambridge, MA: MIT Press.

Farias, D., Davis, C., Harrington, G. (2006). Drawing: Its contribution to naming in
aphasia. Brain and Language, 97, 1, 53-63.

Fodor, J. (1978). Representations: Philosophical essays on the foundations of cognitive
science. Cambridge, MA: MIT Press.

Fodor, J. (1983). The modularity of mind: An essay in faculty psychology. Cambridge,
MA: MIT Press.

Fodor, J. (1987). Psychosemantics: The problem of meaning in the philosophy of mind.
Cambridge, MA: MIT Press.

Fodor, J. A., Pylyshyn, Z. W. (1988). Connectionism and cognitive architecture: A
critical analysis. Cognition, 28, 1-2, 3-71.

Gainotti, G., Miceli, G., Caltagirone, C. (1979). The relationships between conceptual
and semantic-lexical disorders in aphasia. International Journal of Neuroscience, 10,
1, 45-50.

Gainotti, G., Silveri, M. C., Villa, G., Caltagirone, C. (1983). Drawing objects from
memory in aphasia. Brain, 106, 3, 613-22.

Gauker, C. (1990). How to Learn a Language like a Chimpanzee. Philosophical
Psychology, 3, 1, 31-53.

Gregory, R. L. (1970). The intelligent eye. New York: McGraw-Hill.

Grice, P. (1957). Meaning. Philosophical Review, 66, 377-88.

Grice, P. (1968). Utterer’s meaning, sentence meaning, and word-meaning. Foundations
of Language, 4, 225-42.

Grice, P. (1969). Utterer’s meaning and intentions. Philosophical Review, 68, 147-77.

Grice, P. (1982). Meaning revisited. In N. V. Smith (Ed.), Mutual Knowledge (pp. 223-243).
New York: Academic Press.

Grice, P. (1989). Studies in the ways of words. Cambridge, MA: Harvard University Press.

Hurlburt, R. T. (1990). Sampling normal and schizophrenic inner experience. New York:
Plenum Press.

Hurlburt, R. T. (1993). Sampling inner experience in disturbed affect. New York: Plenum
Press.

Jackendoff, R. (1996). How language helps us think. Pragmatics and Cognition, 4, 1.

James, W. (1890). The Principles of Psychology (2 vols.). New York: Henry Holt (Reprinted
Bristol: Thoemmes Press, 1999).


Kail, R. (1994). A method for studying the generalized slowing hypothesis in children 
with specific language impairment. Speech Hearing Research, 37, 418-421. 

Levelt, W. (1989). Speaking: From intention to articulation. MIT Press. 

Malpas, J. (2009). Donald Davidson. Entry in Stanford Encyclopedia of Philosophy
(http://plato.stanford.edu/entries/davidson/). Last accessed: Mon, Jun 29, 2009.

Mervis, C. B., Becerra, A. M. (2007). Language and communicative development in 
Williams syndrome. Mental Retardation and Developmental Disabilities Research Re- 
views, 13, 3-15. 

Nisbett, R., Wilson, T. (1977). Telling more than we can know: Verbal reports on mental 
processes. Psychological Review, 84, 231-59. 

Norbury, C., Bishop, D. V. M., Briscoe, J. (2001). Production of English finite verb
morphology: A comparison of SLI and mild-moderate hearing impairment. Journal of
Speech, Language, and Hearing Research, 44, 165-178.

Patterson, K., Fushimi, T. (2006). Organisation of language in the brain: Does it matter
which language you speak? Interdisciplinary Science Reviews, 3, 201-16. 

Pinker, S. (1994). The language instinct: How the mind creates language. New York:
William Morrow and Co. 

Prinz, J. (2002). Furnishing the mind: Concepts and their perceptual basis. Cambridge,
MA: MIT Press.

Prinz, J. J. (2006). Is the mind really modular? In R. Stainton (Ed.), Contemporary debates
in cognitive science (pp. 22-36). Oxford: Blackwell.

Rumelhart, D. E., Smolensky, P., McClelland, J. L., and Hinton, G. E. (1986). Parallel 
distributed models of schemata and sequential thought processes. In McClelland, 
J. L. and Rumelhart, D. E., (Eds.), Parallel Distributed Processing: Explorations in the 
Microstructure of Cognition. Cambridge, MA: MIT Press.

Swindell, C. S., Greenhouse, J. B. (1988). Characteristics of recovery of drawing ability 
in left and right brain-damaged patients. Brain and Cognition, 7, 16-30. 

Vygotsky, L. S. (1962). Thought and Language. Cambridge, MA: MIT Press. 

Whorf, B. (1956). Language, Thought and Reality. New Jersey: Wiley.




Postface 


(Olaf Hauk, University of Cambridge) 


The question as to how the mind creates “mental images” of concepts and memories, 
and how we use them in communication and thought, has fascinated philosophers for 
centuries. While it seems obvious that we acquire the meaning of objects, actions, words 
and abstract entities through our senses by interacting with our environment, the ques- 
tion as to how our bodies and environment shape the representation of meaning in mind 
and brain is still highly controversial in cognitive science. The last two decades have 
seen a number of exciting developments in this area. The concept of “embodiment”, 
i.e. the idea that sensory-motor systems can be part of abstract higher-level processes 
and representations, has penetrated a wide range of scientific fields. As demonstrated 
in these conference proceedings, the topic is discussed in literature studies, theoreti- 
cal and computational linguistics, psycholinguistics, and cognitive neuroscience. From 
Chinese characters to metaphors and Shakespeare - the involvement of sensory-motor 
representations is part of the debate. 

The analysis of everyday language use, the measurement of reaction times in
laboratory tasks, and the imaging of brain activity during language comprehension have
provided us with a wealth of data on the role of sensory-motor knowledge in language.
However, the excitement over the interdisciplinarity of this research area also comes at 
a cost: Are we all talking about the same things when we talk about embodiment or the 
role of sensory-motor systems? While for some researchers embodiment manifests 
itself in the different usage of verbs in metaphors, others require changes in brain 
activity in specific parts of the cortex. The conference “Sensory Motor Concepts in 
Language and Cognition” in Düsseldorf provided an ideal forum to discuss issues like
these, and brought together world-leading experts from several relevant disciplines. 

It became apparent that the abstract concept of embodiment is itself embodied in
different ways in language corpora, reaction times and brain activation. We may not all
ask the same questions. But connecting different theoretical approaches will help us to
ask better questions, and introducing each other to different methodological approaches
will help us answer them. Let us hope that our research will be embodied in more
conferences like this in the future.